What is the boto3 equivalent of aws s3 ls?

Time:12-01

I am trying to replicate the command aws s3 ls s3://bucket/prefix/ using boto3. Currently, I am able to grab all the objects within the path using

import boto3

s3 = boto3.client('s3')

bucket = "my-bucket"
prefix = "my-prefix"
paginator = s3.get_paginator('list_objects_v2')
page_iterator = paginator.paginate(Bucket=bucket, Prefix=prefix)

Then, I can iterate through page_iterator and manually reconstruct the top-level directories within that path. However, since there are a ton of objects inside the path, retrieving all the objects to reconstruct the results of this command takes roughly 30 seconds for me, whereas the AWS CLI command is pretty much instant. Is there a more efficient way to do this?

CodePudding user response:

You should use the Delimiter option of list_objects_v2 to group any objects with a common prefix together. This is basically what aws s3 ls does without the --recursive switch:

import boto3
s3 = boto3.client('s3')
bucket = "my-bucket"
prefix = "my-prefix/"  # trailing slash, so the listing shows what is inside the prefix

paginator = s3.get_paginator('list_objects_v2')

# List all objects, group objects with a common prefix
for page in paginator.paginate(Bucket=bucket, Prefix=prefix, Delimiter="/"):
    # CommonPrefixes and Contents might not be included in a page if there
    # are no items, so use .get() to return an empty list in that case
    for cur in page.get("CommonPrefixes", []):
        print("<PRE> " + cur["Prefix"])
    for cur in page.get("Contents", []):
        print(cur["Size"], cur["Key"])
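
If you want the results back as data instead of printed, the same loop can be wrapped in a small helper. This is only a sketch: the names summarize_pages and s3_ls are made up for illustration and are not part of boto3. It also assumes you want the behaviour of aws s3 ls s3://bucket/prefix/, so it appends a trailing slash to the prefix. With Delimiter="/", a prefix without the trailing slash generally comes back as a single common prefix (for example my-prefix/) rather than listing its children.

```python
def summarize_pages(pages):
    """Collect sub-prefixes and (size, key) pairs from list_objects_v2 pages.

    `pages` is any iterable of response dicts, such as the iterator
    returned by paginator.paginate(...).
    """
    prefixes, objects = [], []
    for page in pages:
        # Either key can be missing from a page, so fall back to an empty list.
        prefixes.extend(cp["Prefix"] for cp in page.get("CommonPrefixes", []))
        objects.extend((obj["Size"], obj["Key"]) for obj in page.get("Contents", []))
    return prefixes, objects


def s3_ls(bucket, prefix):
    """Hypothetical wrapper: one level of a bucket, similar to `aws s3 ls`."""
    import boto3  # imported here so summarize_pages can be used without boto3

    # Assumption: the caller wants the contents of the prefix, not the prefix itself.
    if prefix and not prefix.endswith("/"):
        prefix += "/"
    paginator = boto3.client("s3").get_paginator("list_objects_v2")
    pages = paginator.paginate(Bucket=bucket, Prefix=prefix, Delimiter="/")
    return summarize_pages(pages)
```

Calling s3_ls("my-bucket", "my-prefix") would then return two lists, the sub-prefixes and the objects directly under my-prefix/. Because of the delimiter, S3 only returns that one level, so the call should not have to walk every object below the prefix.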