Delete files in s3 bucket folder

Time:10-19

I have a script that deletes the files in an S3 bucket, including the files in subfolders. It is not supposed to delete the directories.

However, it also deletes a directory/folder if that folder was uploaded rather than created in S3 itself.

 Example:

   S3 bucket name: Test
       subfolder1 (created in S3)
       subfolder2 (created in S3)
       subfolder3 (uploaded)

So when I run my script, it deletes subfolder3 along with the files inside it. I don't want to delete subfolder3; I only need to delete the files inside those folders.

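   import boto3
   s3 = boto3.resource('s3')
   bucket = s3.Bucket('Test')  # bucket name taken from the example above
   # Delete every object whose key does not end in '/'
   for obj in bucket.objects.all():
       if not obj.key.endswith('/'):
           print(obj.key)
           s3.Object(bucket.name, obj.key).delete()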

CodePudding user response:

The first thing to realise is that folders do not actually exist in Amazon S3. As per your example, an object could be uploaded to s3://bucket/subfolder3/foo.txt and the subfolder3 folder will magically 'appear'. When that object is deleted, the folder will 'disappear' (because it never actually existed).

The best way to use S3 is simply not to worry about folders. They are provided merely as a way of grouping objects under prefixes and are not required.

When a user 'creates' a folder in the S3 management console by clicking the Create folder button, a zero-length object is created whose key is the folder name followed by a slash. This forces the folder to 'appear' in S3 even though there are no objects in the path. The zero-length object effectively is the folder. This is why you are receiving different results in your code.

If you want to retain folders that were manually created, do not delete the zero-length objects that have a slash at the end of their name. This will retain the appearance of those folders.

In your case, subfolder3 will always disappear because a zero-length object was never created. So, either create one or simply do not worry about folders because they don't actually serve a specific purpose in S3.
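If you do want subfolder3 to keep appearing after its files are deleted, one option is to create the zero-length object yourself. The following is only a minimal sketch: the helper names folder_marker_key and create_folder_marker are hypothetical, and s3_client is assumed to be a boto3 S3 client (for example boto3.client("s3")) with permission to write to the bucket.

```python
def folder_marker_key(folder_name):
    """Return the key a 'folder' object would use: the name plus one trailing slash."""
    return folder_name.rstrip("/") + "/"


def create_folder_marker(s3_client, bucket_name, folder_name):
    """Create a zero-length object so the folder still appears when it is empty.

    s3_client is assumed to be a boto3 S3 client, e.g. boto3.client("s3").
    """
    key = folder_marker_key(folder_name)
    # An empty Body stores a zero-length object under a key ending in '/'.
    s3_client.put_object(Bucket=bucket_name, Key=key, Body=b"")
    return key
```

Calling create_folder_marker(s3_client, "Test", "subfolder3") once should leave a subfolder3/ object behind, which the script in the question skips because its key ends in a slash.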

CodePudding user response:

Your script is not actually deleting any folders. The folders that you think were deleted never existed in the first place.

Let's say that you start with an empty S3 bucket and you upload 3 files with the following keys:

  1. cats/siberian.png
  2. dogs/akita.png
  3. dogs/poodle.png

At this point, you will have 3 objects in S3. You appear to have two folders (cats/ and dogs/) but you don't really have these folders. If you delete the 3 objects then you will have zero objects (and zero folders). The folders that you thought you had didn't actually exist - they were inferred from the presence of objects with keys beginning cats/ and dogs/.

If you start again with an empty S3 bucket, but this time you use the AWS S3 Console to create folders cats/ and dogs/ and then upload the same three files listed above, at this point you will have 5 objects:

  1. cats/
  2. cats/siberian.png
  3. dogs/
  4. dogs/akita.png
  5. dogs/poodle.png

You have 5 objects, 2 of which represent folders, and 3 of which represent files. The reason that you have 5 objects, not 3, is that when you asked the S3 Console to create 2 folders, it actually created 2 objects that appear to be folders. If you now delete the 3 PNG objects, you will still have 2 remaining objects:

  1. cats/
  2. dogs/

That is to say that you will appear to have two folders remaining.
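The two scenarios above can be replayed without touching S3 at all, using plain lists of keys. This is only an illustrative sketch: apparent_folders and keys_left_after_deleting_files are hypothetical helpers that imitate how a console infers folders and what the script in the question leaves behind.

```python
def apparent_folders(keys):
    """Top-level 'folders' implied by the keys: everything up to the first slash."""
    return sorted({key.split("/", 1)[0] + "/" for key in keys if "/" in key})


def keys_left_after_deleting_files(keys):
    """Mimic the question's script: only keys ending in '/' survive."""
    return [key for key in keys if key.endswith("/")]


uploaded_only = ["cats/siberian.png", "dogs/akita.png", "dogs/poodle.png"]
with_console_folders = ["cats/", "dogs/"] + uploaded_only

# Scenario 1: folders are only inferred, so they vanish with the files.
print(apparent_folders(uploaded_only))                                   # ['cats/', 'dogs/']
print(apparent_folders(keys_left_after_deleting_files(uploaded_only)))   # []

# Scenario 2: the console created marker objects, so the folders remain.
print(keys_left_after_deleting_files(with_console_folders))              # ['cats/', 'dogs/']
```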

Note at this point that you can delete these 2 'folder' objects using the S3 Console (or the awscli or any SDK). They are simply objects (albeit zero-sized and looking like folders because their keys end in /) and can be deleted just like you delete a regular S3 object.
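As a rough sketch of deleting those 'folder' objects with an SDK, the function below removes every zero-length object whose key ends in a slash. The name delete_folder_markers is hypothetical, and bucket is assumed to be a boto3 Bucket resource such as boto3.resource("s3").Bucket("Test"); the size check is an extra precaution that the answer does not require.

```python
def delete_folder_markers(bucket):
    """Delete zero-length objects whose keys end in '/' and return their keys.

    bucket is assumed to be a boto3 s3.Bucket resource.
    """
    deleted = []
    for obj in bucket.objects.all():
        # Folder markers are ordinary objects: empty, with a key ending in '/'.
        if obj.key.endswith("/") and obj.size == 0:
            obj.delete()
            deleted.append(obj.key)
    return deleted
```

Note that deleting a marker only removes the empty placeholder; the folder will still appear for as long as other objects share its prefix.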
