I have the following dataset which I am looking to summarize within a report. Basically I am looking for status count (by status) and group for each key. I have tried several ways to completing this to no avail.
[
{
"_id": ObjectId("635f808302d5f6cb7c293298"),
"key": "scientific",
"custom_properties": [
{
"namespace": "common metadata",
"key": "status",
"scope": "public",
"value": "To be uploaded",
"inherited": null
},
{
"namespace": "common metadata",
"key": "version",
"scope": "public",
"value": "1.0",
"inherited": null
},
{
"namespace": "common metadata",
"key": "start date",
"scope": "public",
"value": "1642550400000",
"inherited": null
},
]
},
{
"_id": ObjectId("635f809d02d5f6cb7c29353c"),
"key": "contracts",
"custom_properties": [
{
"namespace": "common metadata",
"key": "status",
"scope": "public",
"value": "To be reviewed",
"inherited": null
},
{
"namespace": "common metadata",
"key": "version",
"scope": "public",
"value": "",
"inherited": null
},
{
"namespace": "contracts",
"key": "expiry date",
"scope": "public",
"value": "",
"inherited": null
},
{
"namespace": "common metadata",
"key": "start date",
"scope": "public",
"value": "",
"inherited": null
},
]
},
{
"_id": ObjectId("635f80a002d5f6cb7c293588"),
"key": "contracts",
"custom_properties": [
{
"namespace": "common metadata",
"key": "status",
"scope": "public",
"value": "To be uploaded",
"inherited": null
},
{
"namespace": "common metadata",
"key": "version",
"scope": "public",
"value": "",
"inherited": null
},
]
}
]
and I am looking to Group and Facet, I am uncertain as to which to complete first.
The desired output should summarize the status count and group by key, and should look similar to the following...
[
{
scientific: [{
'To be reviewed': 0,
'To be uploaded': 1
}],
contracts: [{
'To be reviewed': 1,
'To be uploaded': 1
}],
}
]
NOTE: both the keys and status values are dynamic. I.e. we could introduce a new status, and there will be several keys introduced over time.
CodePudding user response:
You can do this with the following pipeline:
All this requires is a couple of $group
stages to count, then some structure manipulation to get the required output
db.collection.aggregate([
{
$group: {
_id: {
key: "$key",
statusVal: {
"$getField": {
"field": "value",
"input": {
"$arrayElemAt": [
{
$filter: {
input: "$custom_properties",
cond: {
$eq: [
"$$this.key",
"status"
]
}
}
},
0
]
},
}
}
},
sum: {
$sum: 1
}
}
},
{
$group: {
_id: "$_id.key",
types: {
$push: {
k: "$_id.statusVal",
v: "$sum"
}
}
}
},
{
$group: {
_id: null,
root: {
$push: {
k: "$_id",
v: {
"$arrayToObject": "$types"
}
}
}
}
},
{
$replaceRoot: {
newRoot: {
"$arrayToObject": "$root"
}
}
}
])