Is there a way to make ListObjectsV2 validate the S3 object type/extension?

Time:11-09

I am checking whether an S3 location exists using ListObjectsV2 in Python 3.

I have a use case where the check also needs to validate the file type or extension of the S3 object.

import boto3

s3 = boto3.client("s3")
bucket = "bucketName"
key = "folder1/folder2/myObject.csv"
res = s3.list_objects_v2(Bucket=bucket, Prefix=key)
print(res)
print(res.get("KeyCount"))
if res.get("KeyCount") > 0:
    print("s3 object exists")
else:
    print("s3 object does not exists")

The example object is: s3://bucketName/folder1/folder2/myObject.csv

The scenarios below give the following output:

  1. s3://bucketName/folder1/folder2/myObject.csv gives "s3 object exists"
  2. s3://bucketName/folder1/folder2/myObject.c gives "s3 object exists"
  3. s3://bucketName/folder1/folder2/myObject. gives "s3 object exists"
  4. s3://bucketName/folder1/folder2/myObject.x gives "s3 object does not exists"

I'm observing that a partial extension also passes the check.

I want scenarios 2 and 3 to give "s3 object does not exists" as well. I might have many extensions, so I can't hard-code an "ends with" check for each one. Maybe I could parse the S3 path, take the part after the last "." as the extension, and compare it with an "ends with" check, but I am looking for a simpler way.

Thanks in advance.

CodePudding user response:

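The Prefix parameter of ListObjectsV2 "Limits the response to keys that begin with the specified prefix." This is exactly what you're seeing, since "folder1/folder2/myObject.csv" starts with both "folder1/folder2/myObject.c" and "folder1/folder2/myObject." but not with "folder1/folder2/myObject.x".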
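The prefix behaviour can be reproduced locally without calling S3. The following is a small sketch using plain string matching; the helper name prefix_matches is hypothetical and only mirrors the "begins with" rule quoted above, it is not part of boto3.

```python
# The one key that actually exists in the bucket (from the question).
existing_key = "folder1/folder2/myObject.csv"


def prefix_matches(existing: str, requested: str) -> bool:
    """Hypothetical helper: mimics the 'keys that begin with the prefix' rule."""
    return existing.startswith(requested)


# The four scenarios from the question, used as the Prefix value.
requested_keys = [
    "folder1/folder2/myObject.csv",
    "folder1/folder2/myObject.c",
    "folder1/folder2/myObject.",
    "folder1/folder2/myObject.x",
]

for requested in requested_keys:
    print(requested, "->", prefix_matches(existing_key, requested))
```

Only the last scenario prints False, which matches the output reported in the question.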

If you want to see if an object exists, you need an API that operates on a specific object. One such option is to call HeadObject and see if it fails because the key does not exist:

import botocore.exceptions

def does_key_exist(s3, bucket, key):
    try:
        # Try to head the object
        s3.head_object(Bucket=bucket, Key=key)
        # All good
        return True
    except botocore.exceptions.ClientError as e:
        # See if the failure is because the object doesn't exist
        code = e.response['Error']['Code']
        if code == "NoSuchKey" or code == "404":
            return False
        else:
            # Some other error, let the caller handle it if they want
            raise
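
If you would rather stay with ListObjectsV2, one possible variation is to keep the Prefix filter but accept only a key that equals the requested key exactly. This is a minimal sketch, not a definitive implementation: the function name key_exists_via_list is hypothetical, and it assumes a general purpose bucket, where keys are listed in lexicographic order so an exact match would appear on the first page of results.

```python
def key_exists_via_list(s3, bucket, key):
    """Hypothetical helper: True only if `key` itself is among the listed keys."""
    # Prefix still narrows the listing to keys that begin with `key`.
    res = s3.list_objects_v2(Bucket=bucket, Prefix=key)
    # "Contents" is absent when nothing matches, so default to an empty list.
    # Assumption: an exact match, if present, is on this first page.
    return any(obj["Key"] == key for obj in res.get("Contents", []))
```

Usage is the same as for does_key_exist above, for example key_exists_via_list(s3, "bucketName", "folder1/folder2/myObject.c"), which should return False when only myObject.csv exists. HeadObject remains the more direct check, since it addresses a single key rather than filtering a listing.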