I'm writing an aggregate query for the following records and output.
Data:
[
{
"_id" : ObjectId("5f3b2626927b18001db86884"),
"collections" : [
Art, Craft
]
},{
"_id" : ObjectId("5f3b2626927b18001db86885"),
"collections" : [
Craft
]
},{
"_id" : ObjectId("5f3b2626927b18001db86886"),
"collections" : [
Apex, Art
]
},
...
]
Expected Output:
count of collections id
{
Art : 2,
Craft : 2,
Apex : 1
}
Right now, we are looping through the collection to calculate count for each collections as the desired output, but it is low in performance because this collection is consists of 10,000 of records.
So, I was thinking to build an aggregate query and if someone can help me to start or point towards a right direction that would be really appreciated. Thank you.
CodePudding user response:
$unwind
$group
$group
$replaceRoot
db.collection.aggregate([
{
$unwind: "$collections"
},
{
"$group": {
"_id": "$collections",
"v": {
"$sum": 1
}
}
},
{
"$group": {
"_id": null,
"collections": {
"$push": {
$arrayToObject: [
[ { "k": "$$ROOT._id", "v": "$$ROOT.v" } ]
]
}
}
}
},
{
$replaceRoot: {
newRoot: {
$mergeObjects: "$collections"
}
}
}
])
CodePudding user response:
I have figured a solution after checking for a while.
db.getCollection("collectionName").aggregate(
[
// get all the records with at least one collection name
{
$match: {
"collections.0": { $exists: true }
}
},
// populate the collection record
{
$lookup: {
from: "from_collection",
localField: "localField",
foreignField: "foreignField",
as: "collections"
}
},
// unwind
{ $unwind: "$collections" },
// group by the collections._id
{ $group: { _id: "$collections._id", collections: { $push: "$$ROOT.ID" } } },
// project with collection contains _id, and count
{
$project : {
collections: "$collections",
count: { $size: "$collections" }
}
}
]
).toArray();
output:
[
{
"_id" : ObjectId("61c4c42d68579f00311dd3e1"),
"collections" : [
"015151",
"015152",
"015153"
],
"count" : 3.0
},
{
"_id" : ObjectId("615f38016f40710033699939"),
"collections" : [
"014871"
],
"count" : 1.0
},
{
"_id" : ObjectId("611fed5ee0d12c00337cb009"),
"collections" : [
"014788",
"014786",
"014789",
"014787",
"014884",
"014893",
"014967",
"014968",
"015016",
"015017"
],
"count" : 10.0
}
...
]