Home > other >  MongoDB - Get rank of the document based on frequency
MongoDB - Get rank of the document based on frequency

Time:01-03

[
    {_id: 1, query: 'A', createdAt: 1660610671 },
    {_id: 2, query: 'A', createdAt: 1660610672 },
    {_id: 3, query: 'A', createdAt: 1660610673 },
    {_id: 4, query: 'A', createdAt: 1660610674 },
    {_id: 5, query: 'B', createdAt: 1660610675 },
    {_id: 6, query: 'C', createdAt: 1660610676 },
    {_id: 7, query: 'C', createdAt: 1660610677 },
    {_id: 8, query: 'C', createdAt: 1660610678 },
    {_id: 9, query: 'D', createdAt: 1660610680 },
    {_id: 10, query: 'D', createdAt: 1660610681 },
]

I have the above database structure. I want to get rank from the frequency of the query value in a specific period.

Maybe it would be something like this.

Queries.getRank({ key: 'query', createdAt: {$gte: startUnix, $lt: endUnix } }) 

I expect the result as below.

Rank

[
    {rank: 1, query: 'A', frequency: 4},
    {rank: 2, query: 'C', frequency: 3},
    {rank: 3, query: 'D', frequency: 2},
    {rank: 4, query: 'B', frequency: 1}
]

Is there a way to achieve it? Thanks.

CodePudding user response:

  1. $match - Filter document within the range for createdAt field (if needed).

  2. $group - Group by query and perform $count as frequency.

  3. $project - Decorate the output document(s).

  4. $setWindowFields - With $rank to perform ranking by sorting frequency descending. May consider $denseRank for the document with the same rank.

db.collection.aggregate([
  // $match stage
  {
    $group: {
      _id: "$query",
      frequency: {
        $sum: 1
      }
    }
  },
  {
    $project: {
      _id: 0,
      query: "$_id",
      frequency: "$frequency"
    }
  },
  {
    $setWindowFields: {
      partitionBy: null,
      sortBy: {
        frequency: -1
      },
      output: {
        rank: {
          $rank: {}
        }
      }
    }
  },
  
])

Demo @ Mongo Playground

CodePudding user response:

You can write the following aggregation pipeline:

db.collection.aggregate([
  {
    "$group": {
      "_id": "$query",
      "frequency": {
        "$sum": 1
      }
    }
  },
  {
    "$project": {
      "query": "$_id",
      "frequency": 1,
      "_id": 0
    }
  },
  {
    "$sort": {
      frequency: -1
    }
  },
  {
    "$group": {
      "_id": null,
      "array": {
        "$push": "$$ROOT"
      }
    }
  },
  {
    "$unwind": {
      path: "$array",
      "includeArrayIndex": "rank"
    }
  },
  {
    "$project": {
      _id: 0,
      rank: {
        "$add": [
          "$rank",
          1
        ]
      },
      frequency: "$array.frequency",
      query: "$array.query"
    }
  }
]);

Playground link.

In this, we first calculate the frequency for each query, then we sort it by the frequency, and finally, we push all documents in an array and calculate the rank, using array index.

  • Related