Is it possible to find all .json files within an S3 bucket when the bucket itself can have multiple sub-directories?
My bucket contains multiple sub-directories, and I would like to collect all the JSON files inside them so that I can iterate over them and parse specific key/values.
CodePudding user response:
Here's a solution (uses the boto3 module):
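import boto3

def list_json_files(bucket_name):
    s3 = boto3.client('s3')  # Create an S3 client
    # 'Contents' is missing when nothing matches, so fall back to an empty list
    objs = s3.list_objects_v2(Bucket=bucket_name).get('Contents', [])
    # Keys are flat, so objects in nested "sub-directories" are listed too
    return [obj for obj in objs if obj['Key'].endswith('.json')]  # json only

files = list_json_files('my-bucket')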
The syntax for the list_objects_v2 function in boto3 can be found here: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/s3.html#S3.Client.list_objects_v2
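Note that a single list_objects_v2 call returns at most 1000 keys. For larger buckets you need to keep requesting pages, either with the continuation token from each response or with a boto3 paginator.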
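To get past the 1000-key limit and actually parse the files, something along these lines may help. It is only a sketch: the function names iter_json_keys and load_json, the optional prefix and client parameters, and the bucket name are my own placeholders, and it assumes each .json object is small enough to read into memory in one go.

```python
import json


def _default_client():
    # boto3 is imported lazily so the helpers can also be run with a stub client
    import boto3
    return boto3.client('s3')


def iter_json_keys(bucket, prefix=None, client=None):
    """Yield the key of every object ending in .json, across all result pages."""
    client = client or _default_client()
    paginator = client.get_paginator('list_objects_v2')
    kwargs = {'Bucket': bucket}
    if prefix:
        kwargs['Prefix'] = prefix  # optionally restrict to one "sub-directory"
    for page in paginator.paginate(**kwargs):
        # 'Contents' is absent on pages with no objects
        for obj in page.get('Contents', []):
            if obj['Key'].endswith('.json'):  # case-sensitive, like the snippet above
                yield obj['Key']


def load_json(bucket, key, client=None):
    """Download one object and parse its body as JSON."""
    client = client or _default_client()
    body = client.get_object(Bucket=bucket, Key=key)['Body'].read()
    return json.loads(body)
```

You could then loop with "for key in iter_json_keys('my-bucket'):" and call load_json('my-bucket', key) inside the loop to read whichever key/values you need. Creating one client up front and passing it to both functions avoids building a new client for every file.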