I understand using boto3 Object.copy_from(...) uses threads but is not asynchronous. Is it possible to make this call asynchronous? If not, is there another way to accomplish this using boto3? I'm finding that moving hundreds/thousands of files is fine, but when i'm processing 100's of thousands of files it gets extremely slow.
CodePudding user response:
You can have a look at aioboto3. It is a third party library, not created by AWS, but it provides asyncio
support for selected (not all) AWS API calls.
CodePudding user response:
I think you can use boto3 along with python threads to handle such cases, In AWS S3 Docs they mentioned
Your application can achieve at least 3,500 PUT/COPY/POST/DELETE or 5,500 GET/HEAD requests per second per prefix in a bucket.
So you can make 3500 uploads by one call, nothing can override this 3500 limit set by AWS.
By using threads you need just 300 (approx) calls.
It Takes 5 Hrs in Worst Case i.e considering that your files are large, will take 1 min for a file to upload on average.
Note: Running more threads consumes more resources on your machine. You must be sure that your machine has enough resources to support the maximum number of concurrent requests that you want.