Let's say there are 10000 documents in a collection, and I have 3 app nodes doing something with those documents. I want each document to be processed only once. Currently, each app node runs a loop that queries the collection with findOneAndUpdate, which finds a document where claimed=false and at the same time updates it to claimed=true. It works, but querying documents one by one is slow.
What I'd like to do is "find up to 100 documents where claimed=false and at the same time update them to claimed=true". I need this to be atomic to avoid race conditions where multiple app nodes claim the same document, but I can't find anything like findManyAndUpdate() in Mongo's documentation. In the SQL world it's basically SELECT ... FOR UPDATE SKIP LOCKED. Is there something like this? Maybe I can utilise Mongo's transactions somehow?
Answer:
Assuming "find up to" is a soft limit, you can run two queries. First:
ids = db.collection.find({claimed: false}, {_id: 1}).limit(100).toArray().map(doc => doc._id)
to collect the _id values into an array ids, then:
db.collection.updateMany({claimed: false, _id: {$in: ids}}, {$set: {claimed: true}})
It will update 0 to 100 documents depending on concurrent updates.
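The two queries above could be wrapped roughly as follows. This is a minimal sketch, not a definitive implementation: the helper name markBatchClaimed is hypothetical, and coll is assumed to be a collection object from the MongoDB Node.js driver (in the shell, the projection is passed directly as the second argument to find instead).

```javascript
// Hypothetical helper: `coll` is assumed to be a Node.js driver collection.
async function markBatchClaimed(coll, limit = 100) {
  // Query 1: fetch only the _id of up to `limit` unclaimed documents.
  const candidates = await coll
    .find({ claimed: false }, { projection: { _id: 1 } })
    .limit(limit)
    .toArray();
  const ids = candidates.map((doc) => doc._id);
  if (ids.length === 0) return 0;

  // Query 2: repeat claimed:false in the filter so that documents claimed
  // by another node between the two queries are left untouched.
  const result = await coll.updateMany(
    { claimed: false, _id: { $in: ids } },
    { $set: { claimed: true } }
  );

  // Only the count is known here, not which of the ids were updated.
  return result.modifiedCount;
}
```

The return value is only a count, so a caller that needs to know exactly which documents it now owns cannot rely on this version alone.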
UPDATE
I guess I missed the point that you actually need to retrieve the documents too, not only update them.
There is no option but to update them individually. Select 100:
db.collection.find({claimed:false}).limit(100)
Then iterate for each _id:
db.collection.updateOne({_id: id, claimed:false}, {$set: {claimed:true}})
The result of each update contains modifiedCount with a value of 1 or 0. Discard the documents that were not modified; they were claimed by a concurrent update.
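Put together, the select-then-claim loop might look like the sketch below. The helper name claimBatch is hypothetical, and coll is again assumed to be a MongoDB Node.js driver collection; treat it as an illustration of the steps above rather than a tested production routine.

```javascript
// Hypothetical helper: `coll` is assumed to be a Node.js driver collection.
async function claimBatch(coll, limit = 100) {
  // Select up to `limit` candidate documents that look unclaimed right now.
  const candidates = await coll
    .find({ claimed: false })
    .limit(limit)
    .toArray();

  const claimed = [];
  for (const doc of candidates) {
    // The claimed:false condition makes this a per-document compare-and-set:
    // only one node can flip the flag for a given _id.
    const result = await coll.updateOne(
      { _id: doc._id, claimed: false },
      { $set: { claimed: true } }
    );
    // modifiedCount is 1 if this node won the document, 0 if another node
    // claimed it first. The local copy predates the update, so its
    // `claimed` field still reads false.
    if (result.modifiedCount === 1) {
      claimed.push(doc);
    }
  }
  return claimed;
}
```

Each app node would process only the documents returned by the helper. This still issues one update per document, but it saves a separate read per document compared with a findOneAndUpdate loop only in the sense that the documents arrive in a single batch query.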