I am facing a problem with MongoDB. I need to count all documents grouped by fields with the highest value of another field.
I'll give you an example:
{ id: 1, scanner: 'bandit', severity: 'low', version: 1 },
{ id: 2, scanner: 'bandit', severity: 'low', version: 1 },
{ id: 3, scanner: 'bandit', severity: 'medium', version: 1 },
{ id: 4, scanner: 'bandit', severity: 'medium', version: 2 },
{ id: 5, scanner: 'bandit', severity: 'high', version: 2 },
{ id: 6, scanner: 'semgrep', severity: 'critical', version: 2 },
{ id: 7, scanner: 'semgrep', severity: 'critical', version: 3 },
{ id: 8, scanner: 'semgrep', severity: 'info', version: 4 },
{ id: 9, scanner: 'semgrep', severity: 'info', version: 4 },
In this case we have two scanners (bandit
and semgrep
).
The highest version for "bandit" documents is 2
while the highest version for "semgrep" documents is 4
.
I need to group them by scanner and severity depending on their highest version, ignoring the previous versions, and count them.
The result should be similar to this:
[
{ "scanner": "bandit", "severity": "medium", "count": 1, "version": 2 },
{ "scanner": "bandit", "severity": "high", "count": 1, "version": 2 },
{ "scanner": "semgrep", "severity": "info", "count": 2, "version": 4 }
]
I'm struggling, I arrive to a point when I cannot ignore the former versions. Any help is appreciated.
Many thanks
CodePudding user response:
With MongoDB v5.0 ,
you can use $setWindowFields
to compute rank for the scanners
grouping. Then, with the highest rank(i.e. rank = 1), you can do another $group
and count.
db.collection.aggregate([
{
"$setWindowFields": {
"partitionBy": "$scanner",
"sortBy": {
"version": -1
},
"output": {
"rank": {
"$rank": {}
}
}
}
},
{
$match: {
"rank": 1
}
},
{
$group: {
_id: {
"scanner": "$scanner",
"severity": "$severity",
"version": "$version"
},
count: {
$sum: 1
}
}
},
{
$project: {
"_id": 0,
"scanner": "$_id.scanner",
"severity": "$_id.severity",
"version": "$_id.version",
"count": 1
}
}
])
Here is the Mongo Playground for your reference.