[
{_id: 1, query: 'A', createdAt: 1660610671 },
{_id: 2, query: 'A', createdAt: 1660610672 },
{_id: 3, query: 'A', createdAt: 1660610673 },
{_id: 4, query: 'A', createdAt: 1660610674 },
{_id: 5, query: 'B', createdAt: 1660610675 },
{_id: 6, query: 'C', createdAt: 1660610676 },
{_id: 7, query: 'C', createdAt: 1660610677 },
{_id: 8, query: 'C', createdAt: 1660610678 },
{_id: 9, query: 'D', createdAt: 1660610680 },
{_id: 10, query: 'D', createdAt: 1660610681 },
]
I have the above database structure. I want to get rank from the frequency of the query
value in a specific period.
Maybe it would be something like this.
Queries.getRank({ key: 'query', createdAt: {$gte: startUnix, $lt: endUnix } })
I expect the result as below.
Rank
[
{rank: 1, query: 'A', frequency: 4},
{rank: 2, query: 'C', frequency: 3},
{rank: 3, query: 'D', frequency: 2},
{rank: 4, query: 'B', frequency: 1}
]
Is there a way to achieve it? Thanks.
CodePudding user response:
$match
- Filter document within the range forcreatedAt
field (if needed).$group
- Group byquery
and perform$count
asfrequency
.$project
- Decorate the output document(s).$setWindowFields
- With$rank
to perform ranking by sortingfrequency
descending. May consider$denseRank
for the document with the same rank.
db.collection.aggregate([
// $match stage
{
$group: {
_id: "$query",
frequency: {
$sum: 1
}
}
},
{
$project: {
_id: 0,
query: "$_id",
frequency: "$frequency"
}
},
{
$setWindowFields: {
partitionBy: null,
sortBy: {
frequency: -1
},
output: {
rank: {
$rank: {}
}
}
}
},
])
CodePudding user response:
You can write the following aggregation pipeline:
db.collection.aggregate([
{
"$group": {
"_id": "$query",
"frequency": {
"$sum": 1
}
}
},
{
"$project": {
"query": "$_id",
"frequency": 1,
"_id": 0
}
},
{
"$sort": {
frequency: -1
}
},
{
"$group": {
"_id": null,
"array": {
"$push": "$$ROOT"
}
}
},
{
"$unwind": {
path: "$array",
"includeArrayIndex": "rank"
}
},
{
"$project": {
_id: 0,
rank: {
"$add": [
"$rank",
1
]
},
frequency: "$array.frequency",
query: "$array.query"
}
}
]);
In this, we first calculate the frequency for each query, then we sort it by the frequency, and finally, we push all documents in an array and calculate the rank, using array index.