Home > Software design >  MonoDB how to group by and count based on multiple fields
MonoDB how to group by and count based on multiple fields

Time:11-18

Given the a collection of documents in below format

[
  {
    "status": "RESOLVED",
    "transactionId": "123abc",
    "associatedId": "association-1",
    "timestamp": "2022-01-01"
  },
  {
    "status": "RESOLVED",
    "transactionId": "123abc",
    "associatedId": "association-1",
    "timestamp": "2022-01-02"
  },
  {
    "status": "NOT_RESOLVED",
    "transactionId": "456xyz",
    "associatedId": "association-1",
    "timestamp": "2022-01-02"
  },
  {
    "status": "RESOLVED",
    "transactionId": "456xyz",
    "associatedId": "association-1",
    "timestamp": "2022-01-02"
  },
  {
    "status": "RESOLVED",
    "transactionId": "456xyz",
    "associatedId": "association-1",
    "timestamp": "2022-01-01"
  },
  {
    "transactionId": "456xyz",
    "associatedId": "association-1",
    "timestamp": "2022-01-01"
  },
  {
    "status": "NOT_RESOLVED",
    "transactionId": "456xyz",
    "associatedId": "association-2",
    "timestamp": "2022-01-02"
  },
  {
    "status": "RESOLVED",
    "transactionId": "456xyz",
    "associatedId": "association-2",
    "timestamp": "2022-01-02"
  },
  {
    "transactionId": "456xyz",
    "associatedId": "association-2",
    "timestamp": "2022-01-01"
  }
]

trying to find total of statuses per transactionId, based on each associatedId irrespective of the timestamp and the grand total

so my output should be

| associatedId  | transactionId | RESOLVED_TOTAL | NOT_RESOLVED_TOTAL | ABSENT_TOTAL (i.e status field is absent in the document)| GRAND_TOTAL |
| association-1 | "123abc"      | 2              | 0                  | 0                                                        | 2           |
| association-1 | "456xyz"      | 2              | 1                  | 1                                                        | 4           |
| association-2 | "456xyz"      | 1              | 1                  | 1                                                        | 3           |

i'm trying to first use $project and set the status: 'ABSENT' where the status field is absent for all documents. And then, use $group first based on associatedId, transactionId, and status, and then use the second $group to create an array containing the total per associatedId and status, but i'm getting stuck. Not sure to proceed further. Highly appreciate any pointers.

CodePudding user response:

Use $ifNull to catch the absent case(i.e. field missing or field: null).

db.collection.aggregate([
  {
    "$group": {
      "_id": {
        associatedId: "$associatedId",
        transactionId: "$transactionId"
      },
      "RESOLVED_TOTAL": {
        $sum: {
          "$cond": {
            "if": {
              $eq: [
                "RESOLVED",
                "$status"
              ]
            },
            "then": 1,
            "else": 0
          }
        }
      },
      "NOT_RESOLVED_TOTAL": {
        $sum: {
          "$cond": {
            "if": {
              $eq: [
                "NOT_RESOLVED",
                "$status"
              ]
            },
            "then": 1,
            "else": 0
          }
        }
      },
      "ABSENT_TOTAL": {
        $sum: {
          "$cond": {
            "if": {
              $eq: [
                null,
                {
                  "$ifNull": [
                    "$status",
                    null
                  ]
                }
              ]
            },
            "then": 1,
            "else": 0
          }
        }
      },
      "GRAND_TOTAL": {
        $sum: 1
      }
    }
  },
  {
    $sort: {
      "_id.associatedId": 1,
      "_id.transactionId": 1
    }
  }
])

Mongo Playground

  • Related