Is there a limit on the number of simultaneous/concurrent read/write operations on a single file stored in AWS S3? I'm designing a solution that requires parallel processing of a large amount of data stored in a single file, which means there will be multiple read/write operations on it at any given point in time. I want to understand whether there is a limit on this. Consistency of data is not a requirement for me.
CodePudding user response:
S3 doesn't sound like the ideal service for your requirement. S3 is object storage. This is an oversimplification, but it basically means that you're dealing with the entire file. Clients don't reach into S3 and read/write parts of a file in place.
When a client "reads" a file on S3, it essentially retrieves a copy of that file (or a byte range of it) into the memory of the client device and reads it from there. Because each read is an independent request, S3 can handle thousands of parallel reads (up to 5,500 GET/HEAD requests per second per prefix, according to this announcement).
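As a minimal sketch of what that looks like with boto3 (the bucket name `example-bucket`, the key `data/large-file.bin`, and the chunk size are placeholders, not anything from your setup), many workers can each issue an independent byte-range GET against the same object:

```python
# Sketch: parallel byte-range GETs against a single S3 object (names are placeholders).
from concurrent.futures import ThreadPoolExecutor

import boto3

s3 = boto3.client("s3")
BUCKET = "example-bucket"        # assumption: replace with your bucket
KEY = "data/large-file.bin"      # assumption: replace with your object key

def read_range(start, end):
    # Each worker issues its own GET with a Range header; S3 serves these
    # requests independently, so they can run in parallel.
    resp = s3.get_object(Bucket=BUCKET, Key=KEY, Range=f"bytes={start}-{end}")
    return resp["Body"].read()

size = s3.head_object(Bucket=BUCKET, Key=KEY)["ContentLength"]
chunk = 8 * 1024 * 1024          # 8 MiB per request (arbitrary choice)
ranges = [(i, min(i + chunk, size) - 1) for i in range(0, size, chunk)]

with ThreadPoolExecutor(max_workers=16) as pool:
    parts = list(pool.map(lambda r: read_range(*r), ranges))

data = b"".join(parts)           # reassemble the object locally
```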
Similarly, writing to S3 means creating a new copy/version of a file. There is no way to write to a file in-place. So while S3 can support a large number of concurrent writes in general, multiple clients can't write to the same copy of a file.
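To illustrate the write side under the same placeholder names: there is no partial-write API, so "editing" even a few bytes of an object means downloading it, modifying it locally, and uploading a complete replacement body.

```python
# Sketch: S3 has no in-place write; updating an object means re-uploading a full body.
import boto3

s3 = boto3.client("s3")
BUCKET = "example-bucket"        # assumption: replace with your bucket
KEY = "data/large-file.bin"      # assumption: replace with your object key

# To change even a few bytes, download the object...
body = bytearray(s3.get_object(Bucket=BUCKET, Key=KEY)["Body"].read())
body[0:4] = b"NEW!"              # ...modify it locally...

# ...and upload a complete replacement; this overwrites whatever anyone else wrote.
s3.put_object(Bucket=BUCKET, Key=KEY, Body=bytes(body))
```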
EFS might fit your requirement, but I think a database designed for this sort of performance would be a better option.
CodePudding user response:
Reads: Multiple concurrent reads are OK. The request limit is 5,500 GET/HEAD requests per second per partitioned prefix.
Writes: Object PUT and DELETE requests are limited to 3,500 requests per second per partitioned prefix. S3 offers strong read-after-write consistency, but there is no locking; the last writer wins:
If two PUT requests are simultaneously made to the same key, the request with the latest timestamp wins.
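A tiny sketch of those semantics in boto3 (bucket and key names are placeholders): each writer uploads a complete body, a subsequent GET reflects the latest successful PUT, and nothing stops one writer from silently overwriting another.

```python
# Sketch: last-writer-wins and strong read-after-write consistency (placeholder names).
import boto3

s3 = boto3.client("s3")
BUCKET = "example-bucket"        # assumption: replace with your bucket
KEY = "jobs/status.txt"          # assumption: replace with your object key

# Two writers updating the same key; with concurrent PUTs, whichever S3
# processes last becomes the current object — there is no merge or lock.
s3.put_object(Bucket=BUCKET, Key=KEY, Body=b"written by A")
s3.put_object(Bucket=BUCKET, Key=KEY, Body=b"written by B")

# Strong read-after-write consistency: this GET returns the latest completed PUT.
print(s3.get_object(Bucket=BUCKET, Key=KEY)["Body"].read())  # b"written by B"
```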
The docs have several concurrent application examples. See also Best practices design patterns: optimizing Amazon S3 performance.
EFS is a file-based storage option for concurrent-access use cases. See the Comparing Amazon Cloud Storage table in the docs.