Merging multiple CSV files into a single file in S3

Time:05-15

I am looking to take a few files in S3 and merge them all into one big file (all files have the same columns). Is it possible to do this directly on S3 using Python, without downloading the files and without using ECS/Lambda?

I have seen that "UploadPartCopy" and "UploadPart" might help, although I am not sure.

I will note that file sizes may vary, from 500KB to 27MB.

I used to do this by downloading the files from S3, concatenating them into one big DataFrame, and then uploading the result back to S3. This worked well, but the machine started to crash once there were too many files...

Thanks

CodePudding user response:

I am looking to take a few files in S3 and merge them all into one big file (all files have the same columns). Is it possible to do this directly on S3 using Python, without downloading the files and without using ECS/Lambda?

No, this isn't possible. S3 doesn't run code for you; it only stores objects. You can't run Python, or any other programming language, on S3 itself, so whatever does the merging has to run somewhere else.
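Since the merge has to run on some client, one way to avoid the crashes described in the question is to stream the files line by line instead of building one big DataFrame. The following is only a minimal sketch under stated assumptions: the helper names `merge_csv_lines` and `merge_s3_csvs` and the bucket/key arguments are made up for illustration, every file is assumed to start with the same single-line header, and no quoted field is assumed to contain an embedded newline.

```python
import tempfile
from typing import Iterable, Iterator


def merge_csv_lines(files: Iterable[Iterable[bytes]]) -> Iterator[bytes]:
    """Yield the lines of several CSV files as one file, keeping only the first header.

    Hypothetical helper: assumes every file starts with the same one-line header.
    """
    header_written = False
    for lines in files:
        it = iter(lines)
        header = next(it, None)
        if header is None:
            continue  # empty file, nothing to copy
        if not header_written:
            yield header.rstrip(b"\r\n") + b"\n"
            header_written = True
        for line in it:
            if line.strip():  # skip blank lines
                yield line.rstrip(b"\r\n") + b"\n"


def merge_s3_csvs(bucket: str, source_keys: Iterable[str], dest_key: str) -> None:
    """Stream the source objects into a temporary file, then upload the merged file.

    Hypothetical helper: bucket and key names are supplied by the caller.
    """
    import boto3  # imported here so merge_csv_lines has no AWS dependency

    s3 = boto3.client("s3")

    def object_lines(key: str) -> Iterator[bytes]:
        # StreamingBody.iter_lines() reads the object in chunks, not all at once.
        return s3.get_object(Bucket=bucket, Key=key)["Body"].iter_lines()

    with tempfile.TemporaryFile() as tmp:
        for line in merge_csv_lines(object_lines(k) for k in source_keys):
            tmp.write(line)
        tmp.seek(0)
        s3.upload_fileobj(tmp, bucket, dest_key)
```

This still downloads the data, but only a small chunk is held in memory at a time, so memory use should not grow with the number of files. As for "UploadPartCopy" from the question: it does copy data between objects on the S3 side, but every part of a multipart upload except the last must be at least 5 MB, and each copied file would bring its own header row along, so it does not fit files as small as 500KB as-is.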
