I have a collection and I want to get the total number of documents and use that amount in a $facet to calculate the 95th percentile (in the example below, the duration for each document is defined as finish_time - start_time).
Let's say I have these documents:
[
{
"_id": ObjectId("178768747638364736373637"),
"start_time": ISODate("2019-02-03T12:00:00.000Z"),
"finish_time": ISODate("2019-02-03T12:01:00.000Z")
},
{
"_id": ObjectId("266747364736363536353555"),
"start_time": ISODate("2019-02-03T12:00:00.000Z"),
"finish_time": ISODate("2019-02-03T12:03:00.000Z")
},
{
"_id": ObjectId("367463536453623546353625"),
"start_time": ISODate("2019-02-03T12:00:00.000Z"),
"finish_time": ISODate("2019-02-03T12:08:00.000Z")
}
]
I'm expecting this output:
[
{
"Percentile95Index": 2.8499999999999996,
"_id": ObjectId("178768747638364736373637"),
"duration": 60,
"totalCount": 3
},
{
"Percentile95Index": 2.8499999999999996,
"_id": ObjectId("266747364736363536353555"),
"duration": 180,
"totalCount": 3
},
{
"Percentile95Index": 2.8499999999999996,
"_id": ObjectId("367463536453623546353625"),
"duration": 480,
"totalCount": 3
}
]
So I did this (https://mongoplayground.net/p/xtHLHouzNQr):
db.collection.aggregate([
{
// Since total count will always be a one-document result, I need $facet to run multiple aggregation pipelines and then merge results.
$facet: {
totalCount: [
{
$count: "value"
}
],
pipelineResults: [
{
"$project": {
duration: {
"$divide": [
{
"$subtract": [
"$finish_time",
"$start_time"
]
},
1000
]// duration is in seconds
},
Percentile95Index: {
"$multiply": [
0.95,
"$totalCount.value" // HERE I'D LIKE TO ACCESS THE TOTAL_COUNT!!!!
]
}
}
}
]
}
},
{
$unwind: "$totalCount"
},
{
$unwind: "$pipelineResults"
},
{
$replaceRoot: {
newRoot: {
$mergeObjects: [
"$pipelineResults",
{
totalCount: "$totalCount.value"
}
]
}
}
}
])
As you can see, inside the Percentile95Index field, I'd like to access the totalCount.value but it's not yet accessible there... and I can't figure out how to do...
Any help?
CodePudding user response:
For my opinion, I don't think the $facet
stage is needed based on your scenario. (Would be great if you can share the reason to use $facet
).
To count the total documents in the collection and render the count value to each document, you may look for $setWindowFields
.
$setWindowFields
- Render thetotalCount
field with the value of the total number of documents in the collection.$project
- Decorate the output documents.
db.collection.aggregate([
{
$setWindowFields: {
output: {
totalCount: {
$sum: 1
}
}
}
},
{
"$project": {
totalCount: 1,
duration: {
"$divide": [
{
"$subtract": [
"$finish_time",
"$start_time"
]
},
1000
]
},
Percentile95Index: {
"$multiply": [
0.95,
"$totalCount"
]
}
}
}
])