I have a list of S3 buckets, each containing some objects. I'm trying to retrieve the number of objects in each bucket, but I'm stuck here:
import boto3

client = boto3.client('s3')

bucket_list = [
    'bkt-1',
    'bkt-2',
    'bkt-3'
]

for objs in bucket_list:
    response = client.list_objects_v2(Bucket=bucket_list[0])
    objs = response['KeyCount']
    print(objs)
But it doesn't seem to iterate over the whole list. It only gives me the number of objects in the first bucket, printed once for each item in my list, instead of the number of objects in each bucket:
4
4
4
What I need is something like:
[
'bkt-1' 4,
'bkt-2' 2,
'bkt-3' 5
]
CodePudding user response:
By using bucket_list[0] in your loop, you're asking for the list of items in the first bucket over and over again. Additionally, you're only getting the number of items in the first page of results for each bucket. A single page never holds more than 1000 items, regardless of the bucket's size. If you want to support larger buckets, you'll need to paginate through the results:
import boto3

client = boto3.client('s3')

# Create a paginator helper for list_objects_v2
paginator = client.get_paginator('list_objects_v2')

bucket_list = [
    'bkt-1',
    'bkt-2',
    'bkt-3'
]

for bucket in bucket_list:
    # Keep a running total for this bucket
    count = 0
    # Work through the response pages, add each page's KeyCount to the running total
    for page in paginator.paginate(Bucket=bucket):
        count += page['KeyCount']
    # Show the bucket name and number of objects
    print(bucket, count)
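If you'd rather collect the counts per bucket name, closer to the output asked for in the question, the same pagination approach can be wrapped in a small helper. This is only a sketch: count_objects is a made-up name, not part of boto3, and it takes the client as an argument so it can be used with boto3.client('s3') or tried out with a stub.

```python
def count_objects(client, bucket_names):
    """Return a dict mapping each bucket name to its total object count.

    `client` is assumed to behave like boto3.client('s3'), i.e. it provides
    get_paginator('list_objects_v2').
    """
    paginator = client.get_paginator('list_objects_v2')
    counts = {}
    for name in bucket_names:
        # KeyCount is the number of keys in a single page,
        # so sum it across every page returned for this bucket.
        counts[name] = sum(
            page['KeyCount'] for page in paginator.paginate(Bucket=name)
        )
    return counts
```

With real credentials configured, calling count_objects(boto3.client('s3'), bucket_list) should give you a dict keyed by bucket name that you can print or format however you like.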
CodePudding user response:
Instead of
client.list_objects_v2(Bucket=bucket_list[0])
it should be
for bucket in bucket_list:
    response = client.list_objects_v2(Bucket=bucket)
    ...