What options are there to ensure that changes to documents in MongoDB are not lost, given a system that works as follows: several workers process Tasks in parallel, and each Task consists of these sequential steps:
- read a document from MongoDB (based on some criteria that depend on the task type)
- perform some changes on the document (the changes are fairly complex and require queries to other services, so something like db.collection.findAndModify() alone would not work)
- write the document back to MongoDB
In CRUD terms, each Task performs an R followed by a U on the document. The problem appears with an RRUU interleaving on the same document, since the second (last) update overwrites the first (previous) one(s).

The question is: how can we ensure that document changes are not lost when two or more Tasks concurrently modify the same document?
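To make the failure mode concrete, here is a minimal sketch of the naive read-modify-write pattern described above, assuming a Node.js worker using the official mongodb driver; enrichDocument() and the { status: 'pending' } filter are illustrative placeholders, not part of the actual system.

```js
// Naive worker logic: the final replaceOne writes back the whole document,
// so with an RRUU interleaving on the same document the earlier worker's
// update is silently overwritten by the later one.
async function processTaskNaively(collection, enrichDocument) {
  // R: read a document matching some task-type-specific criteria
  const doc = await collection.findOne({ status: 'pending' });
  if (!doc) return;

  // Complex changes that involve calls to other services (slow)
  const changed = await enrichDocument(doc);

  // U: write the document back, clobbering any concurrent updates
  await collection.replaceOne({ _id: doc._id }, changed);
}
```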
CodePudding user response:
An approach that immediately comes to mind is versioning your documents.
Let each document have a version field with, say, integer values.
- Worker 1 reads a document and sees that its version is 5.
- Worker 2 reads the same document.
- Worker 1 finishes computing the new content for the document and performs findAndModify({query: {_id: document._id, version: 5}, update: { ...set the new document data here and increment the version... }}). The update succeeds and the version becomes 6.
- Worker 2 finishes computing its new content and attempts the same kind of call: findAndModify({query: {_id: document._id, version: 5}, update: { ...different content... }}).
- Worker 2 fails to find the document for the update, because its version is no longer 5. It can now either abort the write, or re-read the document and retry.
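Putting this together, here is a minimal sketch of the version-check-and-retry loop, again assuming a Node.js worker with the official mongodb driver; computeNewContent() and the maxRetries value are illustrative placeholders. updateOne is used instead of findAndModify only to keep return-value handling simple; the conditional filter on version is the same idea.

```js
async function processTaskWithVersioning(collection, filter, computeNewContent, maxRetries = 5) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    // R: read the document together with its current version
    const doc = await collection.findOne(filter);
    if (!doc) return null;

    // Complex changes, possibly involving calls to other services;
    // computeNewContent should return only the fields to change
    // (not _id and not version)
    const newContent = await computeNewContent(doc);

    // U: conditional write; it matches only if nobody has bumped
    // the version since we read the document
    const result = await collection.updateOne(
      { _id: doc._id, version: doc.version },
      { $set: newContent, $inc: { version: 1 } }
    );
    if (result.matchedCount === 1) return true; // our update won

    // Another worker updated the document first: re-read and retry
  }
  throw new Error('Too many concurrent-update conflicts, giving up');
}
```

Whether a conflicting worker should retry (as above) or abort depends on whether its changes can simply be recomputed from the freshly read document.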