Find all JSON files within S3 Bucket

Time:12-14

Is it possible to find all .json files in an S3 bucket when the bucket can contain multiple sub-directories?

My bucket contains multiple sub-directories, and I would like to collect all the JSON files in them so that I can iterate over them and parse specific keys/values.

CodePudding user response:

Here's a solution (uses the boto3 module):

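import boto3

def list_json_objects(bucket):
    s3 = boto3.client('s3')  # Create the S3 client
    # An empty bucket returns no 'Contents' key, so fall back to an empty list
    objs = s3.list_objects_v2(Bucket=bucket).get('Contents', [])
    # Sub-directories are only key prefixes, so one listing covers all of them
    return [obj for obj in objs if obj['Key'].endswith('.json')]  # json only

files = list_json_objects('my-bucket')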

The syntax for the list_objects_v2 function in boto3 can be found here: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/s3.html#S3.Client.list_objects_v2

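Note that list_objects_v2 returns at most 1000 keys per call. For larger buckets, pass the NextContinuationToken from each response as ContinuationToken in the next call, or use a boto3 paginator.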
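
To get past the 1000-key limit and actually read the files, something along these lines might work. It is only a sketch: the helper names iter_json_keys and load_json are made up for illustration, the client is passed in as an argument, and 'my-bucket' and 'some_key' in the usage comments are placeholders. It relies on the boto3 client's get_paginator('list_objects_v2') and get_object calls.

```python
import json


def iter_json_keys(s3_client, bucket, prefix=""):
    """Yield the key of every .json object in the bucket, across all pages.

    iter_json_keys is a hypothetical helper name, not part of boto3.
    """
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # "Contents" is missing from a page that holds no objects.
        for obj in page.get("Contents", []):
            if obj["Key"].endswith(".json"):
                yield obj["Key"]


def load_json(s3_client, bucket, key):
    """Download one object and parse its body as JSON (hypothetical helper)."""
    response = s3_client.get_object(Bucket=bucket, Key=key)
    return json.loads(response["Body"].read())


# Usage (assumes AWS credentials are configured; names are placeholders):
#
#   import boto3
#   s3 = boto3.client("s3")
#   for key in iter_json_keys(s3, "my-bucket"):
#       data = load_json(s3, "my-bucket", key)
#       print(key, data.get("some_key"))
```

Because the keys are yielded one at a time, a large bucket does not have to be listed completely before the first file is parsed. The prefix argument can narrow the search to a single sub-directory.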
