S3 bucket size for a subset of bucket names


How can I use a custom list of S3 bucket names read from a local file? Summing sizes over all buckets sometimes takes too long for large buckets or for objects in a different storage class, and I'm also not sure why it sometimes doesn't show all my S3 buckets.

with open('subsetbucketslist.txt') as f:
    allbuckets = f.read().splitlines()

How can I use a local file of bucket names as input?

By default, it lists all buckets:

import boto3

s3 = boto3.resource('s3')

# Iterate over every bucket in the account and sum the size of all its objects
for mybucket in s3.buckets.all():
    mybucket_size = sum(obj.size for obj in s3.Bucket(mybucket.name).objects.all())
    print(mybucket.name, mybucket_size)

CodePudding user response:

If you want to calculate the size for particular buckets, then put those bucket names in your for loop:

import boto3

s3 = boto3.resource('s3')

# Read one bucket name per line from the local file
with open('subsetbucketslist.txt') as f:
    allbuckets = f.read().splitlines()

# Sum the size of every object in each listed bucket
for bucket_name in allbuckets:
    bucket_size = sum(obj.size for obj in s3.Bucket(bucket_name).objects.all())
    print(bucket_name, bucket_size)
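
If listing objects through the resource API is still slow for very large buckets, the list_objects_v2 paginator lets you sum sizes page by page instead of building one large list in memory. A minimal sketch under the same assumptions (it reuses the subsetbucketslist.txt file from the question; note that both approaches still cost one API call per 1,000 objects):

import boto3

s3_client = boto3.client('s3')
paginator = s3_client.get_paginator('list_objects_v2')

with open('subsetbucketslist.txt') as f:
    allbuckets = f.read().splitlines()

for bucket_name in allbuckets:
    bucket_size = 0
    # Each page holds up to 1,000 keys; 'Contents' is absent for empty buckets
    for page in paginator.paginate(Bucket=bucket_name):
        bucket_size += sum(obj['Size'] for obj in page.get('Contents', []))
    print(bucket_name, bucket_size)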

It's also worth mentioning that Amazon CloudWatch keeps track of bucket sizes via the BucketSizeBytes metric, which does not require listing any objects. See: Metrics and dimensions - Amazon Simple Storage Service
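
Since BucketSizeBytes is published once a day per storage class, it can be queried directly. A minimal sketch of doing so with boto3 (the helper name and 'my-bucket' are placeholders; StandardStorage covers only the Standard class, and other classes such as StandardIAStorage use their own StorageType value):

import boto3
from datetime import datetime, timedelta, timezone

cloudwatch = boto3.client('cloudwatch')

def bucket_size_from_cloudwatch(bucket_name, storage_type='StandardStorage'):
    # Fetch the most recent daily BucketSizeBytes datapoint, or None if absent
    now = datetime.now(timezone.utc)
    response = cloudwatch.get_metric_statistics(
        Namespace='AWS/S3',
        MetricName='BucketSizeBytes',
        Dimensions=[
            {'Name': 'BucketName', 'Value': bucket_name},
            {'Name': 'StorageType', 'Value': storage_type},
        ],
        StartTime=now - timedelta(days=2),  # the metric appears once a day
        EndTime=now,
        Period=86400,
        Statistics=['Average'],
    )
    datapoints = response['Datapoints']
    if not datapoints:
        return None
    return max(datapoints, key=lambda d: d['Timestamp'])['Average']

print(bucket_size_from_cloudwatch('my-bucket'))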
