Wanted to find the sum of Nested json array via mongodb query:
$group is asking for _id for grouping each obj in array, which is not working.
The documents is like this :
{
point : "3",
created_date : "2022-06-19",
"name" : "grpA",
"value_format" : [
{
"data" : {
"A" : 10,
"B" : 20,
"C" : 30,
"D" : 40,
},
"key" : "key1"
},
"data" : {
"A" : 50,
"B" : 60,
"C" : 70,
"D" : 80,
},
"key" : "key2"
}
]
}
Expected output :
{
"key" : "anything",
"val" : {
"sumA": 60,
"sumB": 80,
"sumC": 100,
"sumD": 120,
}
}
CodePudding user response:
Query1
- assuming that A,B,C,D is part of your schema and not unknown data
- and assuming that you want the sum for each document you can do this
aggregate(
[{"$set":
{"sumA": {"$sum": "$value_format.data.A"},
"sumB": {"$sum": "$value_format.data.B"},
"sumC": {"$sum": "$value_format.data.C"},
"sumD": {"$sum": "$value_format.data.D"}}}])
Query2
- assuming dynamic fields (data on fields(unknown schema) => complicated and slow query)
- and want to sum all collection not just in 1 doc
- unwind the
value_format
- make all pair documents A:10 to {"k" : A, "v" 10}
- we need this to group by "k" (this is the problem with dynamic keys)
- unwind data
- group by this k
- and then back again to have the
sumA : ...
arrayToObject
- group by nil, and collect all in 1 document
*most likely you need query2, the complexity comes from the data on schema, and this is kinda fast way, it could be worse
aggregate(
[{"$unwind": "$value_format"},
{"$project": {"data": {"$objectToArray": "$value_format.data"}}},
{"$unwind": "$data"},
{"$group": {"_id": "$data.k", "sum": {"$sum": "$data.v"}}},
{"$replaceRoot":
{"newRoot":
{"$arrayToObject":
[[{"k": {"$concat": ["sum", "$_id"]}, "v": "$sum"}]]}}},
{"$group": {"_id": null, "val": {"$mergeObjects": "$$ROOT"}}}])
Query3
- this sums with unknown fields(like query2) but finds the sums for each document
- uses one extra collection with 1 empty document
one_doc_collection=[{}]
- this is like a trick, when we want to use stage operators to an array, and we dont want to unwind all documents (we use the lookup and unwind locally)
- query is similar to query 2, just do those inside the lookup its big because data on fields, both in the original data and in the output
aggregate(
[{"$project":
{"value_format":
{"$map":
{"input": "$value_format",
"in": {"$objectToArray": "$$this.data"}}}}},
{"$set":
{"value_format":
{"$reduce":
{"input": "$value_format",
"initialValue": [],
"in": {"$concatArrays": ["$$value", "$$this"]}}}}},
{"$lookup":
{"from": "one_doc_collection",
"pipeline":
[{"$set": {"value_format": "$$value_format"}},
{"$unwind": "$value_format"},
{"$group":
{"_id": "$value_format.k", "sum": {"$sum": "$value_format.v"}}},
{"$replaceRoot":
{"newRoot":
{"$arrayToObject":
[[{"k": {"$concat": ["sum", "$_id"]}, "v": "$sum"}]]}}}],
"as": "sums",
"let": {"value_format": "$value_format"}}},
{"$project":
{"sums":
{"$reduce":
{"input": "$sums",
"initialValue": {},
"in": {"$mergeObjects": ["$$value", "$$this"]}}}}}])