I tried to list all files in a bucket. Here is my code:
import boto3
s3 = boto3.resource('s3')
my_bucket = s3.Bucket('my_project')
for my_bucket_object in my_bucket.objects.all():
    print(my_bucket_object.key)
It works: I get all the file names. However, when I tried to do the same thing on a folder, the code raised an error:
import boto3
s3 = boto3.resource('s3')
my_bucket = s3.Bucket('my_project/data/') # add the folder name
for my_bucket_object in my_bucket.objects.all():
    print(my_bucket_object.key)
Here is the error:
botocore.exceptions.ParamValidationError: Parameter validation failed:
Invalid bucket name "carlos-cryptocurrency-research-project/data/": Bucket name must match the regex "^[a-zA-Z0-9.\-_]{1,255}$" or be an ARN matching the regex "^arn:(aws).*:(s3|s3-object-lambda):[a-z\-0-9]*:[0-9]{12}:accesspoint[/:][a-zA-Z0-9\-.]{1,63}$|^arn:(aws).*:s3-outposts:[a-z\-0-9]+:[0-9]{12}:outpost[/:][a-zA-Z0-9\-]{1,63}[/:]accesspoint[/:][a-zA-Z0-9\-]{1,63}$"
I'm sure the folder name is correct, and I tried replacing it with the Amazon Resource Name (ARN) and the S3 URI, but I still get the error.
Answer:
You can't specify a prefix/folder in the Bucket constructor; it accepts only the bucket name. Instead, use the client-level API and call list_objects_v2 with a Prefix, something like this:
import boto3
client = boto3.client('s3')
response = client.list_objects_v2(
    Bucket='my_bucket',
    Prefix='data/')
for content in response.get('Contents', []):
    print(content['Key'])
Note that a single list_objects_v2 call returns at most 1,000 objects. If the prefix may hold more than that, use a paginator.
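For reference, here is a minimal sketch of the paginator approach. The helper name iter_keys and its optional client argument are my own choices for illustration, not part of boto3; 'my_bucket' and 'data/' are the same placeholders as above.

```python
def iter_keys(bucket, prefix, client=None):
    """Yield every object key under `prefix`, following pagination."""
    if client is None:
        # Imported lazily so a stub client can be passed in for testing.
        import boto3
        client = boto3.client('s3')
    # The paginator issues repeated list_objects_v2 calls until all pages are read.
    paginator = client.get_paginator('list_objects_v2')
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # A page has no 'Contents' entry when nothing matches the prefix.
        for content in page.get('Contents', []):
            yield content['Key']
```

You would call it as `for key in iter_keys('my_bucket', 'data/'): print(key)`. If you would rather stay with the resource API from the question, `my_bucket.objects.filter(Prefix='data/')` on a Bucket created with the bare bucket name should be the analogous call, and resource collections handle pagination for you.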