Data loss in AWS Lambda function when downloading file from S3 to /tmp

Time:08-01

I have written an AWS Lambda function that downloads a file from an S3 location to the /tmp directory (Lambda's local storage). The download succeeds, but the downloaded file is not the same size as the original, and I am not sure why.

    import os

    import boto3


    def data_processor(event, context):
        print("EVENT:: ", event)
        bucket_name = 'asr-collection'
        fileKey = 'cc_continuous/testing/1645136763813.wav'

        # Local scratch space in the Lambda environment
        path = '/tmp'
        output_path = os.path.join(path, 'mydir')
        if not os.path.exists(output_path):
            os.makedirs(output_path)

        s3 = boto3.client("s3")

        new_file_name = output_path + '/' + os.path.basename(fileKey)

        s3.download_file(
            Bucket=bucket_name, Key=fileKey, Filename=new_file_name
        )

        print('File size is: ' + str(os.path.getsize(new_file_name)))

        return None

Output:

File size is: 337964

The actual object is 230 MB, but the downloaded file is only about 330 KB.

I tried download_fileobj() as well. Any idea how I can download the file as it is, without any data loss?

CodePudding user response:

The issue may be that the bucket you are downloading from is in a different region from the one your Lambda is hosted in. Apparently, this does not make a difference when the same code runs locally.

Check your bucket locations relative to your Lambda region.

Note that setting the region on your client lets you use a Lambda in a different region from your bucket. However, if you intend to pull down larger files, you will get lower network latency by keeping your Lambda in the same region as your bucket.
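
As a rough sketch of setting the region on the client: the snippet below looks up the bucket's region and then builds a client for it. The helper names `bucket_region` and `client_for_bucket` are made up for illustration and are not part of boto3. `get_bucket_location` and the `region_name` argument are real boto3 features, but whether this resolves the size mismatch in the question is an assumption.

```python
def bucket_region(s3_client, bucket_name):
    """Return the bucket's region name, normalising S3's legacy values.

    Hypothetical helper: s3_client is expected to be a boto3 S3 client.
    """
    response = s3_client.get_bucket_location(Bucket=bucket_name)
    location = response.get("LocationConstraint")
    # S3 reports None for us-east-1 and may report "EU" for eu-west-1.
    if location is None:
        return "us-east-1"
    if location == "EU":
        return "eu-west-1"
    return location


def client_for_bucket(bucket_name):
    """Build an S3 client pinned to the bucket's own region (hypothetical helper)."""
    import boto3  # imported here so bucket_region can be used without boto3

    region = bucket_region(boto3.client("s3"), bucket_name)
    return boto3.client("s3", region_name=region)
```

Inside the handler, `s3 = client_for_bucket('asr-collection')` would replace the plain `boto3.client("s3")` call. Comparing the returned region with the Lambda's own region (available in the `AWS_REGION` environment variable) is one way to check the bucket location.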

CodePudding user response:

Working with an S3 resource instance instead of a client fixed it.

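    import os
    import boto3

    s3 = boto3.resource('s3')
    bucket_name = 'asr-collection'  # assumed: the bucket from the question

    keys = ['TestFolder1/testing/1651219413148.wav']
    for KEY in keys:
        local_file_name = '/tmp/' + KEY
        # The key contains folders, so create them locally first
        os.makedirs(os.path.dirname(local_file_name), exist_ok=True)
        s3.Bucket(bucket_name).download_file(KEY, local_file_name)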
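
Whichever call is used, a cheap way to catch the truncation described in the question is to compare the size S3 reports for the object with the size of the file on disk. This is a minimal sketch, not a fix for the root cause. `download_and_verify` is a hypothetical helper name, and `s3_client` is assumed to be a boto3 S3 client such as `boto3.client("s3")`.

```python
import os


def download_and_verify(s3_client, bucket_name, key, dest_dir="/tmp"):
    """Download one object and raise if the local size differs from S3's.

    Hypothetical helper; s3_client is assumed to be a boto3 S3 client.
    """
    os.makedirs(dest_dir, exist_ok=True)
    local_path = os.path.join(dest_dir, os.path.basename(key))

    # head_object reports the stored object's size in bytes as ContentLength.
    expected = s3_client.head_object(Bucket=bucket_name, Key=key)["ContentLength"]

    s3_client.download_file(Bucket=bucket_name, Key=key, Filename=local_path)

    actual = os.path.getsize(local_path)
    if actual != expected:
        raise IOError(f"{key}: expected {expected} bytes, got {actual}")
    return local_path
```

If the sizes match but differ from what you expect, the object stored under that key is probably not the file you think it is, which is worth ruling out before changing the download code.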