I have a Lambda function that replicates files from one bucket to another whenever a PUT event fires on the source bucket. No errors show up in the CloudWatch logs, but the file never gets copied. This only happens for this key, which is partitioned by date.
Lambda Event
{
    "Records": [
        {
            "s3": {
                "s3SchemaVersion": "1.0",
                "configurationId": "lasic2-artifacts",
                "bucket": {
                    "name": "BUCKETNAME",
                    "arn": "arn:aws:s3:::BUCKETNAME"
                },
                "object": {
                    "key": "models/operatorai-model-store/lasic2/2022/03/08/10%3A21%3A05/artifacts.tar.gz"
                }
            }
        }
    ]
}
Lambda Function
import boto3
from botocore.exceptions import ClientError

print("Loading function")
s3 = boto3.client("s3", region_name="us-east-1")


class NoRecords(Exception):
    """
    Exception thrown when there are no records found from
    s3:ObjectCreated:Put
    """


def get_source(bucket, key):
    """
    Returns the source object to be passed when copying over the contents from
    bucket A to bucket B

    :param bucket: name of the bucket to copy the key from
    :param key: the path of the object to copy
    """
    return {
        "Bucket": bucket,
        "Key": key,
    }


def process_record(record, production_bucket, staging_bucket):
    """
    Process an individual record (an example record can be found here:
    https://docs.aws.amazon.com/lambda/latest/dg/with-s3-example.html#test-manual-invoke)

    :param record: a record from s3:ObjectCreated:Put
    :param production_bucket: name of the production bucket, taken from the
        record
    :param staging_bucket: name of the staging bucket to copy the key from
        the production_bucket into
    """
    key = record["s3"]["object"]["key"]
    print(f"Key: \n{key}")
    try:
        s3_response = s3.get_object(Bucket=production_bucket, Key=key)
        s3_object = s3_response["Body"].read()
        copy_source = get_source(bucket=production_bucket, key=key)
        s3.copy_object(
            Bucket=staging_bucket,
            Key=key,
            CopySource=copy_source,
            ACL="bucket-owner-full-control",
        )
    except ClientError as error:
        error_code = error.response["Error"]["Code"]
        error_message = error.response["Error"]["Message"]
        if error_code == "NoSuchBucket":
            print(error_message)
            raise
    except Exception as error:
        print(f"Failed to upload {key}")
        print(error)
        raise


def lambda_handler(event, _):
    print(f"Event: \n{event}")
    records = event["Records"]
    num_records = len(records)
    if num_records == 0:
        raise NoRecords("No records found")
    record = records[0]
    production_bucket = record["s3"]["bucket"]["name"]
    staging_bucket = f"{production_bucket}-staging"
    process_record(
        record=record,
        production_bucket=production_bucket,
        staging_bucket=staging_bucket,
    )
CodePudding user response:
The hint to the issue is in the event that you received:
"key": "models/operatorai-model-store/lasic2/2022/03/08/10:21:05/artifacts.tar.gz"
You can see the object key here is encoded. The documentation is explicit about this:
The s3 key provides information about the bucket and object involved in the event. The object key name value is URL encoded. For example, "red flower.jpg" becomes "red+flower.jpg" (Amazon S3 returns "application/x-www-form-urlencoded" as the content type in the response).
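To see what decoding does, here's a quick sketch using the key from the event above:

import urllib.parse

encoded = "models/operatorai-model-store/lasic2/2022/03/08/10%3A21%3A05/artifacts.tar.gz"
print(urllib.parse.unquote_plus(encoded))
# prints: models/operatorai-model-store/lasic2/2022/03/08/10:21:05/artifacts.tar.gz

Note that unquote_plus also turns "+" back into a space, which matches the form encoding S3 uses for keys like "red flower.jpg".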
Since all of the SDK APIs you can use in boto3 expect an unencoded string, you'll need to decode the object key as it comes into the Lambda before using it:
import urllib.parse

# ....
key = record["s3"]["object"]["key"]
key = urllib.parse.unquote_plus(key)
print(f"Key: \n{key}")