I need each document in a collection to be updated only if its content is different, regardless of the order of the elements in nested lists.
Fundamentally, two versions should be the same if the elements are identical regardless of their order. MongoDB does not do that, by default.
def upsert(query, update):
# collection is a pymongo.collection.Collection object
result = collection.update_one(query, update, upsert=True)
print("\tFound match: ", result.matched_count > 0)
print("\tCreated: ", result.upserted_id is not None)
print("\tModified existing: ", result.modified_count > 0)
query = {"name": "Some name"}
update = {"$set": {
"products": [
{"product_name": "a"},
{"product_name": "b"},
{"product_name": "c"}]
}}
print("First update")
upsert(query, update)
print("Same update")
upsert(query, update)
update = {"$set": {
"products": [
{"product_name": "c"},
{"product_name": "b"},
{"product_name": "a"}]
}}
print("Update with different order of products")
upsert(query, update)
Output:
First update
Found match: False
Created: True
Modified existing: False
Same update
Found match: True
Created: False
Modified existing: False
Update with different order of products
Found match: True
Created: False
Modified existing: True
The last update does modify the document because the order of products are indeed different.
I did find a working solution which is to compare a sorting of the queried document's content and a sorting of the new one.
Thanks to Zero Piraeus's response for the short and convenient way to sort for comparison.
def ordered(obj):
if isinstance(obj, dict):
return sorted((k, ordered(v)) for k, v in obj.items())
if isinstance(obj, list):
return sorted(ordered(x) for x in obj)
else:
return obj
I apply it to compare the current and the new versions of the document. If their sorting are different, I apply the update.
new_update = {
"products": [
{"product_name": "b"},
{"product_name": "c"},
{"product_name": "a"}]
}
returned_doc = collection.find_one(query)
# Merging remote document with local dictionary
merged_doc = {**returned_doc, **new_update}
if ordered(returned_doc) != ordered(merged_doc):
upsert(query, {"$set": new_update})
print("Updated")
else:
print("Not Updated")
Output:
Not Updated
That works, but that relies on python to do the comparison, introducing a delay between the read and the write.
Is there a way to do it atomically ? Or, even better, a way to set a MongoDB Collection to adopt some kind of "order inside arrays doesn't matter" mode ?
This is part of a generic implementation. Documents can have any kind of nesting in their structure.
CodePudding user response:
One option is to use the query part of the update to work only on documents that are matching your condition:
db.collection.update(
{"name": "Some name",
"products.product_name": {
$not: {$all: ["a", "b", "c"]}}
},
{"$set": {
"products": [
{"product_name": "b"},
{"product_name": "c"},
{"product_name": "a"}
]
}}
)
See how it works on the playground example
CodePudding user response:
I'm uncertain about ordering using "$addToSet"
in an update_one
. I suppose it depends how the the set operations are implemented in the MongoDB server.
Anyway, here's an option to try that also preserves the upsert
option.
new_new_update = {
"$addToSet": {
"products": {
"$each": [
{
"product_name": "b"
},
{
"product_name": "c"
},
{
"product_name": "a"
}
]
}
}
}
Here's a mongoplayground.net example to demonstrate the concept. You can change the "name"
value to verify the upsert
option.
I'm curious what the result
values are for this update_one
.