I have a dataset like this in mongodb
[
{
"task_id": "as4d2rds5",
"url": "https:example1.com",
"organization": "Avengers",
"val": "null"
},
{
"task_id": "rfre43fed",
"url": "https:example1.com",
"organization": "Avengers",
"val": "valid"
},
{
"task_id": "uyje3dsxs",
"url": "https:example2.com",
"organization": "Metro",
"val": "valid"
},
{
"task_id": "ghs563vt6",
"url": "https:example1.com",
"organization": "Avengers",
"val": "invalid"
},
{
"task_id": "erf6egy64",
"url": "https:example2.com",
"organization": "Metro",
"val": "null"
}
]
I am trying to create an mongodb aggregate function so that it will yield a result like
[
{
"Metro": {
"invalid": 0,
"null": 1,
"valid": 1,
"url": "https:example2.com"
},
},
{
"Avengers": {
"invalid": 1,
"null": 1,
"valid": 1,
"url": "https:example1.com"
}
}
]
I got a great deal of help from stackoverflow to reach here.
I need to reformat the data received from an aggregator so that it produces the above result. Present aggregation script is
db.collection.aggregate([ {"$group": {"_id": {"k": "$organization","v": "$val"},"cnt": {"$sum": 1},"url": {$first: "$url"}}},
{"$project": {"_id": 0, "url": 1, "k": "$_id.k", "o": {"k": "$_id.v", "v": "$cnt"}}},
{"$group": {"_id": "$k", "v": { "$push": "$o"}, "url": {"$first": "$url"}}},
{"$addFields": {"v": {"$mergeObjects": [{"null": 0,"valid": 0,"invalid": 0},{"$arrayToObject": "$v"}]}}},
{"$project": {"_id": 0, "url": 1, "new": [{"k": "$_id","v": "$v"}]}},
{"$addFields": {"new": {"$mergeObjects": [{"$arrayToObject": "$new"},{"url": "$url"}]}}},
{"$replaceRoot": {"newRoot": "$new"}} ])
CodePudding user response:
You can try this query to avoid multiple $group
(as a trade off you have three $filter
but I think this is still better than a multiple $group
):
This query group by organization
and then use $project
to output the size of how many "valid", "invalid" and "null" exists.
Edit: Also you can add an extra step $replaceRoot
to get exactly the same output as you want.
db.collection.aggregate([
{
"$group": {
"_id": "$organization",
"val": {
"$push": "$val"
},
"url": {
"$first": "$url"
}
}
},
{
"$project": {
"_id": 0,
"organization": [
{
"k": "$_id",
"v": {
"url": "$url",
"invalid": {
"$size": {
"$filter": {
"input": "$val",
"cond": {
"$eq": [
"$$this",
"invalid"
]
}
}
}
},
"valid": {
"$size": {
"$filter": {
"input": "$val",
"cond": {
"$eq": [
"$$this",
"valid"
]
}
}
}
},
"null": {
"$size": {
"$filter": {
"input": "$val",
"cond": {
"$eq": [
"$$this",
"null"
]
}
}
}
}
}
}
]
}
},
{
"$replaceRoot": {
"newRoot": {
"$arrayToObject": "$organization"
}
}
}
])
Example here