Consider a collection with the following documents:
[
  {
    "_id": "3981396a-9fcb-4c24-976f-d500f20c4fab",
    "entries": [
      {
        "key": "var1",
        "value": "value1"
      },
      {
        "key": "var1",
        "value": "value11"
      },
      {
        "key": "var2",
        "value": "value2"
      }
    ]
  }
]
What would be an appropriate approach to de-duplicate the entries of each document in the collection? The query should at least find all documents with duplicated entries; looping over them manually afterwards would be acceptable. Even better if it can all be done in a single aggregation pipeline.
The expected result is the following:
[
  {
    "_id": "3981396a-9fcb-4c24-976f-d500f20c4fab",
    "entries": [
      {
        "key": "var1",
        "value": "value1"
      },
      {
        "key": "var2",
        "value": "value2"
      }
    ]
  }
]
CodePudding user response:
You can use $reduce to perform a conditional insert into a placeholder array: append the current element only if its key is not already present, so the first occurrence of each key is kept. Finally, replace the entries array with the placeholder array.
db.collection.update({},
[
  {
    $set: {
      entries: {
        "$reduce": {
          "input": "$entries",
          "initialValue": [],
          "in": {
            "$cond": {
              // is the current key already in the accumulated array?
              "if": { "$in": ["$$this.key", "$$value.key"] },
              // yes: keep the accumulator unchanged
              "then": "$$value",
              // no: append the current element
              "else": { "$concatArrays": ["$$value", ["$$this"]] }
            }
          }
        }
      }
    }
  }
],
{
  multi: true
})
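To make the $reduce logic easier to follow, here is a rough plain-JavaScript sketch of what the pipeline does to a single entries array. It runs outside MongoDB, and dedupeByKey is just a name made up for this illustration, not part of any MongoDB API.

```javascript
// Plain-JavaScript mirror of the $reduce / $cond / $in logic above.
// dedupeByKey is a hypothetical helper name used only for illustration.
function dedupeByKey(entries) {
  return entries.reduce((kept, entry) => {
    // Same test as {$in: ["$$this.key", "$$value.key"]}
    const seen = kept.some((e) => e.key === entry.key);
    // Same branches as $cond: keep the accumulator, or append the entry
    return seen ? kept : kept.concat([entry]);
  }, []);
}

// Sample data copied from the question
const entries = [
  { key: "var1", value: "value1" },
  { key: "var1", value: "value11" },
  { key: "var2", value: "value2" },
];

console.log(dedupeByKey(entries));
```

With the sample data from the question this keeps value1 for var1 and drops value11, which matches the expected result.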
CodePudding user response:
Query
- You can also do it with stage operators, in a slightly tricky way, using a "local" unwind.
- $lookup against a collection that contains a single empty document.
- This lets you use stage operators to manipulate the array members, i.e. do a "local" unwind.
- Unwind inside the lookup pipeline, group by the key, and keep only one value per key.
*I don't suggest this is the best way in your case, but this "local" unwind can be useful.
col.aggregate(
[{"$lookup":
{"from": "dummy_collection_with_1_empty_doc",
"pipeline":
[{"$set": {"entries": "$$entries"}},
{"$unwind": "$entries"},
{"$group":
{"_id": "$entries.key", "value": {"$first": "$entries.value"}}},
{"$project": {"_id": 0, "key": "$_id", "value": 1}}],
"as": "entries",
"let": {"entries": "$entries"}}}])
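As a rough illustration of what the inner $unwind / $group / $project stages compute for one document, here is a plain-JavaScript sketch. It runs outside MongoDB, and groupFirstByKey is a made-up name for this example only.

```javascript
// Plain-JavaScript mirror of $unwind + $group ($first) + $project.
// groupFirstByKey is a hypothetical helper name used only for illustration.
function groupFirstByKey(entries) {
  const firstValue = new Map();
  for (const { key, value } of entries) {
    // Like $first: only the first value seen for each key is stored
    if (!firstValue.has(key)) firstValue.set(key, value);
  }
  // Same reshaping as the final $project: {key, value} objects
  return Array.from(firstValue, ([key, value]) => ({ key, value }));
}

// Sample data copied from the question
const sample = [
  { key: "var1", value: "value1" },
  { key: "var1", value: "value11" },
  { key: "var2", value: "value2" },
];

console.log(groupFirstByKey(sample));
```

One difference to keep in mind: a JavaScript Map iterates in insertion order, whereas $group does not guarantee the order of its output documents, so the de-duplicated entries may come back in a different order than in the original array.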