Home > Software engineering >  2 level mongo $group - aggregation
2 level mongo $group - aggregation

Time:09-07

Been trying to read the docs and examples to try and figure this out but it's proving difficult (especially as a noobie with little idea behind what all the different terms mean).

We have a couple of collections and are trying to quantify/chart some data from it. We have managed to massage the data via aggregation stages so that the documents look like this:

[
  {
    _id: 1, 
    sender: 'foo', 
    messages: [
      {_id: 10, text: 'hello', recipientCount: 3, sentAt: '2019-10'},
      {_id: 11, text: 'hello', recipientCount: 3, sentAt: '2019-10'},
      {_id: 12, text: 'hello', recipientCount: 3, sentAt: '2019-10'},
    ]
  },
  {
    _id: 2, 
    sender: 'bar', 
    messages: [
      {_id: 13, text: 'hello', recipientCount: 3, sentAt: '2018-10'},
      {_id: 14, text: 'hello', recipientCount: 3, sentAt: '2018-10'},
      {_id: 15, text: 'hello', recipientCount: 3, sentAt: '2018-10'},
    ]
  },
  {
    _id: 3, 
    sender: 'foo', 
    messages: [
      {_id: 16, text: 'hello', recipientCount: 13, sentAt: '2020-10'},
      {_id: 17, text: 'hello', recipientCount: 13, sentAt: '2020-10'},
      {_id: 18, text: 'hello', recipientCount: 13, sentAt: '2020-10'},
    ]
  },
  {
    _id: 4, 
    sender: 'foo', 
    messages: [
      {_id: 19, text: 'hello', recipientCount: 3, sentAt: '2021-10'},
      {_id: 110, text: 'hello', recipientCount: 3, sentAt: '2021-10'},
      {_id: 111, text: 'hello', recipientCount: 3, sentAt: '2021-10'},
    ]
  },
  {
    _id: 5, 
    sender: 'bar', 
    messages: [
      {_id: 112, text: 'hello', recipientCount: 1, sentAt: '2021-4'},
      {_id: 113, text: 'hello', recipientCount: 1, sentAt: '2021-4'},
      {_id: 114, text: 'hello', recipientCount: 1, sentAt: '2021-4'},
    ]
  },
  {
    _id: 6, 
    sender: 'foo', 
    messages: [
      {_id: 115, text: 'hello', recipientCount: 4, sentAt: '2020-8'},
      {_id: 116, text: 'hello', recipientCount: 4, sentAt: '2020-8'},
      {_id: 117, text: 'hello', recipientCount: 4, sentAt: '2020-8'},
    ]
  },
  {
    _id: 7, 
    sender: 'cap', 
    messages: [
      {_id: 118, text: 'hello', recipientCount: 7, sentAt: '2018-6'},
      {_id: 119, text: 'hello', recipientCount: 7, sentAt: '2018-6'},
      {_id: 120, text: 'hello', recipientCount: 7, sentAt: '2018-6'},
    ]
  },
  {
    _id: 8, 
    sender: 'cap', 
    messages: [
      {_id: 121, text: 'hello', recipientCount: 12, sentAt: '2019-11'},
      {_id: 122, text: 'hello', recipientCount: 12, sentAt: '2019-11'},
      {_id: 123, text: 'hello', recipientCount: 12, sentAt: '2019-11'},
    ]
  },
  {
    _id: 9, 
    sender: 'foo', 
    messages: [
      {_id: 124, text: 'hello', recipientCount: 2, sentAt: '2020-12'},
      {_id: 125, text: 'hello', recipientCount: 2, sentAt: '2020-12'},
      {_id: 126, text: 'hello', recipientCount: 2, sentAt: '2020-12'},
    ]
  },
  {
    _id: 10, 
    sender: 'foo', 
    messages: [
      {_id: 127, text: 'hello', recipientCount: 1, sentAt: '2021-1'},
      {_id: 128, text: 'hello', recipientCount: 1, sentAt: '2021-1'},
      {_id: 129, text: 'hello', recipientCount: 1, sentAt: '2021-1'},
    ]
  },
  {
    _id: 11, 
    sender: 'cap', 
    messages: [
      {_id: 130, text: 'hello', recipientCount: 2, sentAt: '2019-2'},
      {_id: 131, text: 'hello', recipientCount: 2, sentAt: '2019-2'},
      {_id: 132, text: 'hello', recipientCount: 2, sentAt: '2019-2'},
    ]
  },
]

We want output that will allow us to create a table showing the total recipientCount grouped by sender and sentAt. I.e. - Something along the lines of:

            2018-1 2018-2 2018-3 2018-4
foo           3       4      31     18
bar           7       24     11     17
cap           19      6      23     75

What would be the next (and final?) stage of this aggregation pipeline?

Thanks in advance!

CodePudding user response:

Just unwind the messages array and group on sender and sentAt.Like this:

db.collection.aggregate([
  {
    "$unwind": "$messages"
  },
  {
    "$group": {
      "_id": {
        sender: "$sender",
        sentAt: "$messages.sentAt"
      },
      "count": {
        "$sum": "$messages.recipientCount"
      }
    }
  },
  {
    "$project": {
      "sender": "$_id.sender",
      "sentAt": "$_id.sentAt",
      "count": 1,
      "_id": 0
    }
  }
])

Playground link.

  • Related