I need a Python script to get the ACL of every file in an S3 bucket, to see whether the files in that bucket are public or private. All the files are images, and the Marketing department wants to know which ones are private.
Something like this
get_acl(object, bucket, ...)
But recursive, for all 10,000 files in that bucket. I can't get this to work with the AWS CLI. Any idea where I can find some examples?
Thanks
CodePudding user response:
When an object in the bucket is public, an unauthenticated request for it should return a 200 status code; if it is private, the code will be 403.
So what you could try first is to get the list of all the objects in your bucket:
aws s3api list-objects --bucket bucketnamehere
Then in Python you could loop over the objects and send a request to each one, for example:
https://bucketname.s3.us-east-1.amazonaws.com/objectname
You can run the same test from the Unix command line with curl:
curl -I https://bucketname.s3.us-east-1.amazonaws.com/objectname
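The same check can be sketched in Python with only the standard library. This is a rough sketch, not a definitive implementation: the function names (`object_url`, `head_status`, `status_to_visibility`, `check_keys`) are made up for illustration, and it assumes the virtual-hosted-style URL shown above with the bucket's real region filled in. Keep in mind that a 403 only tells you the anonymous request was denied; it does not tell you why.

```python
import urllib.error
import urllib.parse
import urllib.request


def object_url(bucket, region, key):
    """Build the virtual-hosted-style URL for an object (key is URL-encoded)."""
    return f"https://{bucket}.s3.{region}.amazonaws.com/{urllib.parse.quote(key)}"


def head_status(url, timeout=10):
    """Send an unauthenticated HEAD request and return the HTTP status code."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx responses; the status code is still available
        return err.code


def status_to_visibility(status_code):
    """Map a status code to a rough label: 200 is public, 403 is private."""
    if status_code == 200:
        return "public"
    if status_code == 403:
        return "private"
    return f"unknown ({status_code})"


def check_keys(bucket, region, keys):
    """Print 'key,visibility' for each key (one network request per key)."""
    for key in keys:
        status = head_status(object_url(bucket, region, key))
        print(f"{key},{status_to_visibility(status)}")
```

For example, `check_keys("bucketname", "us-east-1", ["objectname"])` would print one line for that object. The list of keys would come from the listing step above.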
CodePudding user response:
As you state, you need to list all of the objects in the bucket, and either check their ACL, or test to see if you can access the object without authentication.
If you want to check the ACLs, you can run through each object in turn and check:
import boto3

BUCKET = "example-bucket"

s3 = boto3.client('s3')
paginator = s3.get_paginator('list_objects_v2')
# List all of the objects, one page at a time
for page in paginator.paginate(Bucket=BUCKET):
    for cur in page.get("Contents", []):
        # Get the ACL for each object in turn
        # Note: This example does not take into
        # account any bucket-level permissions
        acl = s3.get_object_acl(Bucket=BUCKET, Key=cur['Key'])
        public_read = False
        public_write = False
        # Check each grant in the ACL
        for grant in acl["Grants"]:
            # See if the All Users group has been given a right, keep track of
            # all possibilities in case there are multiple rules for some reason
            if grant["Grantee"].get("URI", "") == "http://acs.amazonaws.com/groups/global/AllUsers":
                if grant["Permission"] in {"READ", "FULL_CONTROL"}:
                    public_read = True
                if grant["Permission"] in {"WRITE", "FULL_CONTROL"}:
                    public_write = True
        # Write out the status for this object
        if public_read and public_write:
            status = "public_read_write"
        elif public_read:
            status = "public_read"
        elif public_write:
            status = "public_write"
        else:
            status = "private"
        print(f"{cur['Key']},{status}")
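If you want to try the grant logic without touching AWS, the check inside the loop can be pulled out into a small function that takes the dictionary returned by `get_object_acl`. This is only a sketch: `classify_acl` and `ALL_USERS_URI` are names invented here, and the sample ACL in the usage note is hand-written to mimic the shape of a real response.

```python
ALL_USERS_URI = "http://acs.amazonaws.com/groups/global/AllUsers"


def classify_acl(acl):
    """Return a status string for an ACL dict shaped like get_object_acl output."""
    public_read = False
    public_write = False
    for grant in acl.get("Grants", []):
        # Only grants to the All Users group make an object public
        if grant["Grantee"].get("URI", "") != ALL_USERS_URI:
            continue
        if grant["Permission"] in {"READ", "FULL_CONTROL"}:
            public_read = True
        if grant["Permission"] in {"WRITE", "FULL_CONTROL"}:
            public_write = True
    if public_read and public_write:
        return "public_read_write"
    if public_read:
        return "public_read"
    if public_write:
        return "public_write"
    return "private"
```

For instance, passing `{"Grants": [{"Grantee": {"Type": "Group", "URI": ALL_USERS_URI}, "Permission": "READ"}]}` gives `"public_read"`, while an ACL whose only grant goes to the owner's canonical user gives `"private"`. In the loop above you would call it as `classify_acl(s3.get_object_acl(Bucket=BUCKET, Key=cur['Key']))` and keep only the keys that come back as `"private"`.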