How can I get the key of a downloaded S3 object using boto3 in Python?

I've downloaded a file from S3 and I'm passing the S3 response object to other functions.

I assume the key/filename must be stored somewhere on this object itself but I can't seem to find it. I don't want to pass the filename to every function that needs it.

All my Googling just shows how to get the name of a file from a bucket without downloading it, not how to get the filename from the response.

I'm using Python/Boto3:

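import boto3

s3 = boto3.resource("s3")

def main():
    file = s3.Object("my cool bucket", "my cool file").get()
    process_file(file)

def process_file(file):
    print(file.name)  # fails: the response from .get() is a dict with no name/key attribute
    # how do I make this work w/o passing in filename as arg to original function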

CodePudding user response：

The response from get_object(...) (and likewise from s3.Object(...).get()) does not include the key ("filename").

Unfortunately, you'll have to pass the key/filename that you used to fetch the object to any other function that needs it.

The response contains only the following properties, none of which is the key:

{
    'Body': StreamingBody(),
    'DeleteMarker': True|False,
    'AcceptRanges': 'string',
    'Expiration': 'string',
    'Restore': 'string',
    'LastModified': datetime(2015, 1, 1),
    'ContentLength': 123,
    'ETag': 'string',
    'MissingMeta': 123,
    'VersionId': 'string',
    'CacheControl': 'string',
    'ContentDisposition': 'string',
    'ContentEncoding': 'string',
    'ContentLanguage': 'string',
    'ContentRange': 'string',
    'ContentType': 'string',
    'Expires': datetime(2015, 1, 1),
    'WebsiteRedirectLocation': 'string',
    'ServerSideEncryption': 'AES256'|'aws:kms',
    'Metadata': {
        'string': 'string'
    },
    'SSECustomerAlgorithm': 'string',
    'SSECustomerKeyMD5': 'string',
    'SSEKMSKeyId': 'string',
    'BucketKeyEnabled': True|False,
    'StorageClass': 'STANDARD'|'REDUCED_REDUNDANCY'|'STANDARD_IA'|'ONEZONE_IA'|'INTELLIGENT_TIERING'|'GLACIER'|'DEEP_ARCHIVE'|'OUTPOSTS',
    'RequestCharged': 'requester',
    'ReplicationStatus': 'COMPLETE'|'PENDING'|'FAILED'|'REPLICA',
    'PartsCount': 123,
    'TagCount': 123,
    'ObjectLockMode': 'GOVERNANCE'|'COMPLIANCE',
    'ObjectLockRetainUntilDate': datetime(2015, 1, 1),
    'ObjectLockLegalHoldStatus': 'ON'|'OFF'
}
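
One way to avoid threading a separate filename argument through every function, sketched here under a few assumptions, is to pass something that already carries the key. With the resource API used in the question, the s3.Object itself exposes bucket_name and key, so it can be handed over instead of the result of .get(). With the client API, a small wrapper can keep the key next to the response. The names process_file, FetchedObject, and fetch below are illustrative only and are not part of boto3.

```python
from dataclasses import dataclass
from typing import Any, Mapping


def process_file(obj):
    """Accept the s3.Object resource itself, which carries bucket_name and key."""
    response = obj.get()            # same kind of dict the question passed around
    data = response["Body"].read()  # StreamingBody.read() returns bytes
    return obj.bucket_name, obj.key, len(data)


@dataclass(frozen=True)
class FetchedObject:
    """Hypothetical wrapper pairing a get_object response with its origin."""
    bucket: str
    key: str
    response: Mapping[str, Any]


def fetch(s3_client, bucket, key):
    """Call get_object and keep the bucket and key alongside the response."""
    response = s3_client.get_object(Bucket=bucket, Key=key)
    return FetchedObject(bucket=bucket, key=key, response=response)
```

With real boto3 objects, the first function would be called as process_file(boto3.resource("s3").Object(bucket, key)) and the second as fetch(boto3.client("s3"), bucket, key). Either way, downstream code reads the key from the object it was given rather than from an extra argument.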

CodePudding user response：

You can use the code below to list a bucket's objects and get each path and file name separately:

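import os

# conn_s3 is assumed to be a boto3 S3 resource, e.g. boto3.resource("s3"),
# and bkt_name the name of the bucket to list
bkt_obj = conn_s3.Bucket(bkt_name)
for obj in bkt_obj.objects.all():
    # skip folder placeholder keys (those ending in '/' or '$')
    if not obj.key.endswith(('/', '$')):
        # os.path.split already returns (directory part, file name)
        path, filename = os.path.split(obj.key)
        print(f"FILE: {obj.key} -> {path}  -> {filename}")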

Output:

FILE: dir1/dir1_file.txt -> dir1  -> dir1_file.txt
FILE: mydatafile.csv ->   -> mydatafile.csv

To build the full local path of the downloaded file, add the following after the code above (indent it into the loop body to handle every object rather than only the last one):

path_local_files = '/home/user/s3_data/'
dest_file = os.path.join(path_local_files, filename)
print(f" --------> {dest_file}")  # use this variable if you need the full path
# bkt_obj.download_file(obj.key, dest_file)  # uncomment to download the file
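
The key-splitting step from the listing loop in this answer can also be pulled out into small pure functions, which makes it easy to check without touching S3. This is only a sketch: split_key and local_path are hypothetical helper names, and using posixpath for the key is my own choice, on the grounds that S3 keys always use '/' as the separator regardless of the local operating system.

```python
import os
import posixpath


def split_key(key):
    """Return (prefix, filename) for an S3 key, or None for folder markers."""
    # Keys ending in '/' or '$' are treated as folder placeholders,
    # mirroring the check in the listing loop.
    if key.endswith(("/", "$")):
        return None
    return posixpath.split(key)


def local_path(key, download_dir):
    """Join the key's file name onto a local directory, or return None."""
    parts = split_key(key)
    if parts is None:
        return None
    # os.path.join is used here because the result is a local filesystem path.
    return os.path.join(download_dir, parts[1])
```

For the two keys in the sample output, split_key("dir1/dir1_file.txt") gives ("dir1", "dir1_file.txt") and split_key("mydatafile.csv") gives ("", "mydatafile.csv").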