How to copy and paste multiple objects in the same S3 location to generate ObjectCreated notifications?

I already have thousands of objects in my S3 bucket, and I have written a Lambda function that processes them; it is triggered whenever a file is dropped into that bucket. I would like to copy the objects whose keys match a pattern and drop the copies back into the same bucket to trigger my Lambda. Currently I am using the method below, which takes a lot of time.

import boto3

s3 = boto3.resource('s3')
bucket_source = s3.Bucket('vistradata')

# Gather the keys under the prefix, then keep only the ones matching the pattern
key_list = [obj.key for obj in bucket_source.objects.filter(Prefix='data/')]
files = [key for key in key_list if 'mystring' in key]

def copy_data_from_s3(input_file):
    # Copy the object onto itself. S3 rejects an in-place copy unless
    # something changes, so replace the metadata to make the request valid.
    copy_source = {
        'Bucket': 'vistradata',
        'Key': input_file
    }
    s3.meta.client.copy(
        copy_source, 'vistradata', input_file,
        ExtraArgs={'MetadataDirective': 'REPLACE',
                   'Metadata': {'retriggered': 'true'}}
    )

for key in files:
    copy_data_from_s3(key)

Is there a better method using aws s3 sync or aws s3 cp? The examples I see online copy data from one bucket to another, not within the same bucket. Thank you.

CodePudding user response:

Yes, you could run a command like this to force the notification to trigger.

aws s3 sync s3://mybucket/ s3://mybucket/folder/

That would copy all the files inside the bucket to a new folder inside the bucket and trigger notifications for each.
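
If you only want to re-trigger keys matching a pattern (like the 'mystring' filter in the question), sync takes --exclude/--include filters, applied in order. A sketch, assuming the question's vistradata bucket and data/ prefix, with a hypothetical replay/ destination folder:

aws s3 sync s3://vistradata/data/ s3://vistradata/replay/ --exclude "*" --include "*mystring*"

The --exclude "*" drops everything first, then --include adds back only the matching keys.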

You could also run it first with notifications disabled, then run it in reverse so the events fire on the original keys.
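
A sketch of that reverse pass, which copies the duplicates back over the original keys (sync re-copies a key when the source object is newer, which the fresh copies will be):

aws s3 sync s3://mybucket/folder/ s3://mybucket/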

CodePudding user response:

You could skip the S3 copying altogether. Your existing loop over the matching keys can invoke your Lambda directly for each file, which is effectively what S3 does when it delivers a notification. Your event payload would be a stripped-down version of the S3 notification event with only the bucket, key, or whatever fields you need.
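
A minimal sketch, assuming a hypothetical function name process-s3-object and reusing the files list from your question:

import json
import boto3

lambda_client = boto3.client('lambda')

def invoke_for_key(bucket, key):
    # Shape the payload like an S3 ObjectCreated notification, keeping
    # only the fields the function actually reads (bucket name and key)
    event = {
        'Records': [{
            's3': {
                'bucket': {'name': bucket},
                'object': {'key': key}
            }
        }]
    }
    # 'Event' means asynchronous invocation, the same way S3 calls Lambda
    lambda_client.invoke(
        FunctionName='process-s3-object',  # hypothetical name
        InvocationType='Event',
        Payload=json.dumps(event)
    )

for key in files:
    invoke_for_key('vistradata', key)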

This will be faster and cheaper than copying S3 objects, but if it's a one-time operation, either your approach or @Coin-Graham's will also get the job done.
