Home > Net >  MongoDB: get count of documents with the highest value between different groups
MongoDB: get count of documents with the highest value between different groups

Time:06-18

I am facing a problem with MongoDB. I need to count all documents grouped by fields with the highest value of another field.

I'll give you an example:

{ id: 1, scanner: 'bandit', severity: 'low', version: 1 },
{ id: 2, scanner: 'bandit', severity: 'low', version: 1 },
{ id: 3, scanner: 'bandit', severity: 'medium', version: 1 },
{ id: 4, scanner: 'bandit', severity: 'medium', version: 2 },
{ id: 5, scanner: 'bandit', severity: 'high', version: 2 },
{ id: 6, scanner: 'semgrep', severity: 'critical', version: 2 },
{ id: 7, scanner: 'semgrep', severity: 'critical', version: 3 },
{ id: 8, scanner: 'semgrep', severity: 'info', version: 4 },
{ id: 9, scanner: 'semgrep', severity: 'info', version: 4 },

In this case we have two scanners (bandit and semgrep).

The highest version for "bandit" documents is 2 while the highest version for "semgrep" documents is 4.

I need to group them by scanner and severity depending on their highest version, ignoring the previous versions, and count them.

The result should be similar to this:

[
    { "scanner": "bandit", "severity": "medium", "count": 1, "version": 2 },
    { "scanner": "bandit", "severity": "high", "count": 1, "version": 2 },
    { "scanner": "semgrep", "severity": "info", "count": 2, "version": 4 }
]

I'm struggling, I arrive to a point when I cannot ignore the former versions. Any help is appreciated.

Many thanks

CodePudding user response:

With MongoDB v5.0 ,

you can use $setWindowFields to compute rank for the scanners grouping. Then, with the highest rank(i.e. rank = 1), you can do another $group and count.

db.collection.aggregate([
  {
    "$setWindowFields": {
      "partitionBy": "$scanner",
      "sortBy": {
        "version": -1
      },
      "output": {
        "rank": {
          "$rank": {}
        }
      }
    }
  },
  {
    $match: {
      "rank": 1
    }
  },
  {
    $group: {
      _id: {
        "scanner": "$scanner",
        "severity": "$severity",
        "version": "$version"
      },
      count: {
        $sum: 1
      }
    }
  },
  {
    $project: {
      "_id": 0,
      "scanner": "$_id.scanner",
      "severity": "$_id.severity",
      "version": "$_id.version",
      "count": 1
    }
  }
])

Here is the Mongo Playground for your reference.

  • Related