Home > Software engineering >  MongoDB find minimum and maximum between grouped data
MongoDB find minimum and maximum between grouped data

Time:09-07

Will try to keep this concise with the input, result and desired/expected result. Need to find the minimum, maximum number of rows/records between the same "winCode" and the last time it occurred in the ordered data. So it makes me want to first group them by "winCode" which works perfectly, but I am not able to come up with something that would display how many records it took for the same "winCode" to appear last time, the minimum and maximum. Check desired output for more details. Below is the paste from: https://mongoplayground.net/p/bCzTO8ZLxNi

Input/collection

[
  {
    code: "1",
    results: {
      winCode: 3
    }
  },
  {
    code: "10",
    results: {
      winCode: 3
    }
  },
  {
    code: "8",
    results: {
      winCode: 2
    }
  },
  {
    code: "5",
    results: {
      winCode: 5
    }
  },
  {
    code: "5",
    results: {
      winCode: 4
    }
  },
  {
    code: "6",
    results: {
      winCode: 4
    }
  },
  {
    code: "7",
    results: {
      winCode: 5
    }
  },
  {
    code: "3",
    results: {
      winCode: 3
    }
  },
  {
    code: "9",
    results: {
      winCode: 2
    }
  },
  {
    code: "2",
    results: {
      winCode: 2
    }
  }
]

Current query

db.collection.aggregate([
  {
    $sort: {
      code: -1
    }
  },
  {
    $group: {
      _id: "$results.winCode",
      count: {
        $sum: 1
      },
      lastTimeOccurredCode: {
        $first: "$code" // Any way to get it to display a count from the start to this point on how many records it went through to get the $first result?
      },
      
    }
  },
  {
    $sort: {
      _id: -1
    }
  },
  
])

Current output

[
  {
    "_id": 5,
    "count": 2,
    "lastTimeOccurredCode": "5"
  },
  {
    "_id": 4,
    "count": 2,
    "lastTimeOccurredCode": "5"
  },
  {
    "_id": 3,
    "count": 3,
    "lastTimeOccurredCode": "1"
  },
  {
    "_id": 2,
    "count": 3,
    "lastTimeOccurredCode": "2"
  }
]

Desired output

[
  {
    "_id": 5,
    "count": 2,
    "lastTimeOccurredRecordsCount": 4,
    "minRecordsBetween": 3,
    "maxRecordsBetween": 3
  },
  {
    "_id": 4,
    "count": 2,
    "lastTimeOccurredRecordsCount": 5,
    "minRecordsBetween": 1,
    "maxRecordsBetween": 1
  },
  {
    "_id": 3,
    "count": 3,
    "lastTimeOccurredRecordsCount": 1,
    "minRecordsBetween": 1,
    "maxRecordsBetween": 6
  },
  {
    "_id": 2,
    "count": 3,
    "lastTimeOccurredRecordsCount": 3,
    "minRecordsBetween": 1,
    "maxRecordsBetween": 6
  }
]

I have tried to add an $accumulator function, but I would need the $first functions result in it, but it's not available at the same $group stage. Feel like I am missing something here.

CodePudding user response:

You can use $setWindowFields to define index and reduce to find the diff between them. If you want the index to be according to {$sort: {code: -1}}, then keep the $setWindowFields sortBy according to this example and remove the redundant {$sort: {code: -1}} step. If you want the index to be according to another sorting logic that only update the $setWindowFields sortBy.

  1. Use $setWindowFields to define index
  2. $sort according to your what you need (if it is different than the prev sort)
  3. $group according to the $results.winCode and keep all index data.
  4. Calculate the diff
  5. Format
db.collection.aggregate([
  {$setWindowFields: {
      sortBy: {code: -1},
      output: {index: {$sum: 1, window: {documents: ["unbounded", "current"]}}}
  }},
  {$sort: {code: -1}},
  {$group: {
      _id: "$results.winCode",
      count: {$sum: 1},
      lastTimeOccurredCode: {$first: "$code"},
      index: {$push: "$index"}
  }},
  {$project: {
      count: 1,
      lastTimeOccurredCode: 1,
      diff: {
        $reduce: {
          input: {$range: [1, {$size: "$index"}]},
          initialValue: [],
          in: {$concatArrays: [
              "$$value",
              [{$subtract: [
                    {$arrayElemAt: ["$index", "$$this"]},
                    {$arrayElemAt: ["$index", {$subtract: ["$$this", 1]}]}
              ]}]
            ]
          }
        }
      }
  }},
  {$set: {
      minRecordsBetween: {$min: "$diff"},
      maxRecordsBetween: {$max: "$diff"},
      diff: "$$REMOVE"
  }},
  {$sort: {_id: -1}}
])

See how it works on the playground example

  • Related