The code below downloads the most recently modified file under a prefix in an S3 bucket to the current directory.
import boto3
s3_client = boto3.client('s3')
response = s3_client.list_objects_v2(Bucket='MY-BUCKET', Prefix='foo/')
objects = sorted(response['Contents'], key=lambda obj: obj['LastModified'])
# Latest object
latest_object = objects[-1]['Key']
filename = latest_object[latest_object.rfind('/') + 1:]  # Strip the path, keep the file name
# Download it to current directory
s3_client.download_file('MY-BUCKET', latest_object, filename)
The list_objects_v2 call returns at most 1,000 objects, and the bucket in use holds more than that. I'm aware a paginator could be the solution. How can it be implemented in the code above?
CodePudding user response:
There is a built-in paginator class you can use: S3.Paginator.ListObjectsV2.
Here is how you can add a paginator to your current code.
import boto3
s3_client = boto3.client('s3')
# Add paginator
paginator = s3_client.get_paginator('list_objects_v2')
# Use pagination
pages = paginator.paginate(Bucket='MY-BUCKET', Prefix='foo/')
# Collect the objects from every page; extend rather than reassign,
# otherwise each page would overwrite the previous one
data = []
for page in pages:
    data.extend(page.get('Contents', []))  # a page may omit 'Contents' when empty
print(data)
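To connect this back to your original goal, here is a minimal sketch that aggregates every page before sorting and then downloads only the newest object. The bucket name and prefix are the same placeholders as above, and it assumes you still want just the single latest file:
import boto3

s3_client = boto3.client('s3')
paginator = s3_client.get_paginator('list_objects_v2')

# Aggregate object records across every page of results
objects = []
for page in paginator.paginate(Bucket='MY-BUCKET', Prefix='foo/'):
    objects.extend(page.get('Contents', []))

if objects:
    # Sort only after collecting all pages, so we find the newest
    # object overall, not just the newest within one page
    objects.sort(key=lambda obj: obj['LastModified'])
    latest_key = objects[-1]['Key']
    filename = latest_key[latest_key.rfind('/') + 1:]  # Strip the path
    s3_client.download_file('MY-BUCKET', latest_key, filename)
The important design point is to defer the sort until after the loop: S3 lists keys in lexicographic order, not by modification time, so sorting per page (or relying on page order) could pick an object that is only the newest within one batch of 1,000.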