I have a mongo collection with documents containing arrays:
{ item: "journal", qty: 25, tags: ["blank", "red"], dim_cm: [ 14, 21 ] },
{ item: "notebook", qty: 50, tags: ["red", "blank"], dim_cm: [ 14, 21 ] },
{ item: "paper", qty: 100, tags: ["red", "blank", "plain"], dim_cm: [ 14, 21 ] },
{ item: "planner", qty: 75, tags: ["blank", "red"], dim_cm: [ 22.85, 30 ] },
{ item: "postcard", qty: 45, tags: ["blue"], dim_cm: [ 10, 15.25 ] }
I would like to get a single array containing all distinct values, such as:
tags: ["blank", "red", blue"] and dim_cm: [14,21,22.85,30,10,15.25]
Is this possible with an aggregation pipeline?
CodePudding user response:
You can use $group
with $reduce
and $setIntersection
:
$group
all documents to create one array of arrays per key- flatten each array with
$reduce
and make it a set using$setIntersection
.
db.collection.aggregate([
{$group: {_id: null, tags: {$push: "$tags"}, dim_cm: {$push: "$dim_cm"}}},
{
$project: {
_id: 0,
tags: {
$setIntersection: [
{$reduce: {
input: "$tags",
initialValue: [],
in: {$concatArrays: ["$$value", "$$this"]}
}
}
]
},
dim_cm: {
$setIntersection: [
{$reduce: {
input: "$dim_cm",
initialValue: [],
in: {$concatArrays: ["$$value", "$$this"]}
}
}
]
}
}
}
])
See how it works on the playground example
Another way is:
db.collection.aggregate([
{$unwind: "$tags"},
{
$group: {
_id: null,
tags: {$addToSet: "$tags"},
dim_cm: {$addToSet: "$dim_cm"
}
},
{$unwind: "$dim_cm"},
{$unwind: "$dim_cm"},
{
$group: {
_id: null,
tags: {$first: "$tags"},
dim_cm: {$addToSet: "$dim_cm"}
}
}
])
Which you can split into two queries which will be much faster:
db.collection.aggregate([
{$unwind: "$tags"},
{
$group: {
_id: null,
tags: {$addToSet: "$tags"}
}
},
])
A 3rd option is:
db.collection.aggregate([
{
$project: {
_id: 0,
arr: {
$concatArrays: [
{$map: {input: "$tags", as: "item", in: {k: "tag", v: "$$item"}}},
{$map: {input: "$dim_cm", as: "item", in: {k: "dim_cm", v: "$$item"}}}
]
}
}
},
{$unwind: "$arr"},
{
$group: {
_id: null,
tags: {
$addToSet: {$cond: [{$eq: ["$arr.k", "tag"]}, "$arr.v", "$$REMOVE"]}
},
dim_cm: {
$addToSet: {$cond: [{$eq: ["$arr.k", "dim_cm"]}, "$arr.v", "$$REMOVE"]}
}
}
}
])
CodePudding user response:
Query
- put both in one array
- unwind
- at group time check the type(string or not) and put to the right group
*here types are different, in case they were the same type, we can do another trick like put the in an array of pairs [[tag,cm] ...]
where first would be the tag and the second would be the cm, or array of documents
For perfomance if you test it send how it went if you can
aggregate(
[{"$project": {"tags-dim": {"$concatArrays": ["$tags", "$dim_cm"]}}},
{"$unwind": "$tags-dim"},
{"$group":
{"_id": null,
"tags":
{"$addToSet":
{"$cond":
[{"$eq": [{"$type": "$tags-dim"}, "string"]}, "$tags-dim",
"$$REMOVE"]}},
"dim_cm":
{"$addToSet":
{"$cond":
[{"$eq": [{"$type": "$tags-dim"}, "string"]}, "$$REMOVE",
"$tags-dim"]}}}}])