Thursday, October 3, 2013

How to delete _$folder$ files from AWS S3 directories?

The _$folder$ files get created in S3 directory structures when certain tools (like S3fox) are used to interact with the S3 file system.
The files are visible only in the AWS S3 console or with s3cmd from the CLI.
The files cause no harm to the file system, but if you want to delete them you can choose either of these ways:
1. Delete them from the AWS S3 console.
2. Delete them from the CLI with s3cmd:
 s3cmd del s3://<s3_bucket_name>/<dir_name>/_\$folder\$
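
The backslashes in the command above are there because the shell would otherwise try to expand $folder as a variable. Here is a small sketch of three equivalent ways to write the literal name; the variable names a, b and c are only for illustration.

```shell
# Three equivalent spellings of the literal name _$folder$
a=_\$folder\$        # backslash-escape each dollar sign
b='_$folder$'        # single quotes: nothing is expanded
c="_\$folder\$"      # double quotes still need the backslashes

# Each line printed is the same literal name
printf '%s\n' "$a" "$b" "$c"
```

Single quotes are usually the easiest to read when the name is typed by hand.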

I deleted these files recursively from my S3 directories with the following script:

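 # List the sub-directories; the last column of each listing line is the path
 dir_list=$(hadoop fs -ls s3://<s3_bucket_name>/<dir_name>/*/ | grep -v '^Found' | awk '{print $NF}')
 for dir in $dir_list
 do
     # Collect the _$folder$ objects under this directory
     file_list=$(s3cmd ls s3://<s3_bucket_name>${dir}/* | grep '_\$folder\$' | awk '{print $NF}')
     for file in $file_list
     do
         # Quoting the variable keeps the literal $ characters intact
         s3cmd del "$file"
     done
 done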
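
Before running a delete loop like the one above, it can help to do a dry run and check which objects would be removed. Below is a minimal sketch of that idea. The function name filter_folder_markers, the bucket name and the sample listing lines are all made up for illustration, and the sample only approximates what s3cmd ls prints.

```shell
# Hypothetical helper: reads "s3cmd ls"-style lines on stdin and prints
# only the object URIs (last column) whose name ends in _$folder$.
filter_folder_markers() {
    awk '$NF ~ /_\$folder\$$/ { print $NF }'
}

# Dry run against canned sample lines (made-up bucket and paths).
printf '%s\n' \
    '2013-10-03 10:00         0   s3://example-bucket/logs/2013_$folder$' \
    '2013-10-03 10:01      1024   s3://example-bucket/logs/2013/part-00000' \
    '                       DIR   s3://example-bucket/logs/2013/' |
    filter_folder_markers
```

Once the printed list looks right, the same filter can be fed from a real s3cmd ls listing, and each resulting URI passed to s3cmd del.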
