There are a lot of similar questions but I don't find an answer to exactly on this question. How to get ALL sub-directories starting from an initial one. The depth of the sub-directories is unknown.
Lets say I have:
data/subdir1/subdir2/file.csv
data/subdir1/subdir3/subdir4/subdir5/file2.csv
data/subdir6/subdir7/subdir8/file3.csv
So I would like to either get a list of all sub-directories all length deep OR even better all the paths one level before the files. In my example I would ideally want to get:
data/subdir1/subdir2/
data/subdir1/subdir3/subdir4/subdir5/
data/subdir6/subdir7/subdir8/
but I could work with this as well:
data/subdir1/
data/subdir1/subdir2/
data/subdir1/subdir3/
data/subdir1/subdir3/subdir4/
etc...
data/subdir6/subdir7/subdir8/
My code so far only gets me one level deep of directories:
result = await self.s3_client.list_objects(
Bucket=bucket, Prefix=prefix, Delimiter="/"
)
subfolders = set()
for content in result.get("CommonPrefixes"):
print(f"sub folder : {content.get('Prefix')}")
subfolders.add(content.get("Prefix"))
return subfolders
CodePudding user response:
import os
# list_objects returns a dictionary. The 'Contents' key contains a
# list of full paths including the file name stored in the bucket
# for example: data/subdir1/subdir3/subdir4/subdir5/file2.csv
objects = s3_client.list_objects(Bucket='bucket_name')['Contents']
# here we iterate over the fullpaths and using
# os.path.dirname we get the fullpath excluding the filename
for obj in objects:
print(os.path.dirname(obj['Key'])