Check file permissions for each file on a S3 Bucket, recursive

Time:05-19

I need a Python script that gets the ACL for each file in an S3 bucket, so I can see which files in that bucket are public and which are private. All the files are images, and the Marketing department wants to know which ones are private.

Something like this

get_acl(object, bucket, ...)

But applied recursively to all 10,000 files in that bucket. I can't get this to work with the AWS CLI. Any idea where I can find some examples?

Thanks

CodePudding user response:

An unauthenticated request for a public object should return HTTP 200, while the same request for a private object returns 403.

So what you could try first is to list all the objects in your bucket:

aws2 s3api list-objects --bucket bucketnamehere

Then, in Python, you could send a request to the URL of each object, for example:

https://bucketname.s3.us-east-1.amazonaws.com/objectname

You can run the same test for a single object from the command line with curl:

curl -I https://bucketname.s3.us-east-1.amazonaws.com/objectname
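To repeat that curl check for every key from Python, something along these lines might work. It is only a sketch: the helper names (object_url, classify_status, head_status, check_keys) are made up for illustration, the URL shape is copied from the example above, and it assumes 200 means public and 403 means private. Keep in mind that a 403 can also come back for a key that does not exist when the bucket cannot be listed anonymously, so it is safest to feed it keys taken from the listing step.

```python
from urllib.error import HTTPError
from urllib.parse import quote
from urllib.request import Request, urlopen


def object_url(bucket, region, key):
    # Same URL shape as the curl example; percent-encode the key but keep "/".
    return f"https://{bucket}.s3.{region}.amazonaws.com/{quote(key, safe='/')}"


def classify_status(code):
    # Assumption: 200 means publicly readable, 403 means private.
    if code == 200:
        return "public"
    if code == 403:
        return "private"
    return f"unknown ({code})"


def head_status(url, timeout=10):
    # Unauthenticated HEAD request, the equivalent of `curl -I`.
    request = Request(url, method="HEAD")
    try:
        with urlopen(request, timeout=timeout) as response:
            return response.status
    except HTTPError as err:
        # urllib raises on 4xx/5xx, but the status code is what we are after.
        return err.code


def check_keys(bucket, region, keys, fetch=head_status):
    # `fetch` can be swapped out, which makes this testable without network access.
    return {key: classify_status(fetch(object_url(bucket, region, key))) for key in keys}
```

You would call it as check_keys("bucketname", "us-east-1", keys), where keys is the list of object keys returned by the list-objects call.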

CodePudding user response:

As you state, you need to list all of the objects in the bucket, and either check their ACL, or test to see if you can access the object without authentication.

If you want to check the ACLs, you can run through each object in turn and check:

import boto3

BUCKET = "example-bucket"

s3 = boto3.client('s3')
paginator = s3.get_paginator('list_objects_v2')
# List all of the objects
for page in paginator.paginate(Bucket=BUCKET):
    for cur in page.get("Contents", []):
        # Get the ACL for each object in turn
        # Note: This example does not take into
        # account any bucket-level permissions
        acl = s3.get_object_acl(Bucket=BUCKET, Key=cur['Key'])
        public_read = False
        public_write = False
        # Check each grant in the ACL
        for grant in acl["Grants"]:
            # See if the All Users group has been given a right, and keep track of
            # all possibilities in case there are multiple rules for some reason
            if grant["Grantee"].get("URI", "") == "http://acs.amazonaws.com/groups/global/AllUsers":
                if grant["Permission"] in {"READ", "FULL_CONTROL"}:
                    public_read = True
                if grant["Permission"] in {"WRITE", "FULL_CONTROL"}:
                    public_write = True

        # Write out the status for this object
        if public_read and public_write:
            status = "public_read_write"
        elif public_read:
            status = "public_read"
        elif public_write:
            status = "public_write"
        else:
            status = "private"
        print(f"{cur['Key']},{status}")
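If you want to test the grant check without calling AWS, one option is to pull it out into a plain function that takes the "Grants" list. This is just a sketch: classify_grants and ALL_USERS_URI are names I made up, and the sample grants further down are hand-written dictionaries in the shape that get_object_acl returns, not real output.

```python
ALL_USERS_URI = "http://acs.amazonaws.com/groups/global/AllUsers"


def classify_grants(grants):
    # Takes the list under acl["Grants"] and returns a status string.
    public_read = False
    public_write = False
    for grant in grants:
        # Only grants to the All Users group make an object public.
        if grant.get("Grantee", {}).get("URI", "") != ALL_USERS_URI:
            continue
        permission = grant.get("Permission")
        if permission in {"READ", "FULL_CONTROL"}:
            public_read = True
        if permission in {"WRITE", "FULL_CONTROL"}:
            public_write = True
    if public_read and public_write:
        return "public_read_write"
    if public_read:
        return "public_read"
    if public_write:
        return "public_write"
    return "private"


# Hand-written sample grants (hypothetical values) to exercise the function.
owner_grant = {"Grantee": {"Type": "CanonicalUser", "ID": "example-owner-id"},
               "Permission": "FULL_CONTROL"}
everyone_read = {"Grantee": {"Type": "Group", "URI": ALL_USERS_URI},
                 "Permission": "READ"}
```

Inside the loop you would then call classify_grants(acl["Grants"]) instead of tracking the two flags inline.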