I have records like this:
[
  {
    "key": 1,
    "val": "1"
  },
  {
    "key": 2,
    "val": "1"
  },
  {
    "key": 3,
    "val": "1"
  },
  {
    "key": 4,
    "val": "2"
  }
]
I want to achieve something like this:
if I give a filter {val: "1"},
I want to get a random 50% (rounded up) of the matching results.
For example, 3 records match here, so one possible result (2 records) could be:
{
  "key": 1,
  "val": "1"
},
{
  "key": 2,
  "val": "1"
}
I could handle this in my application code after fetching all the results, but this is one stage of my aggregation pipeline, and its output is used as input to the following stages.
CodePudding user response:
You can first create a random sort key with $rand. Use $setWindowFields to compute the $rank and the total count. Finally, use $divide to compute the relative rank and keep the documents whose relative rank is less than or equal to your preferred threshold (the sampling fraction).
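db.collection.aggregate([
  // Attach a random sort key to every document.
  {
    "$addFields": {
      "randSortKey": { "$rand": {} }
    }
  },
  // Rank the documents by the random key and count all documents in the partition.
  {
    "$setWindowFields": {
      "partitionBy": null,
      "sortBy": { "randSortKey": 1 },
      "output": {
        "rank": { "$rank": {} },
        "total": { "$sum": 1 }
      }
    }
  },
  // Keep the documents whose relative rank is within the sampling fraction.
  {
    "$match": {
      "$expr": {
        "$lte": [
          { "$divide": ["$rank", "$total"] },
          0.5
        ]
      }
    }
  },
  // Drop the helper fields.
  {
    "$unset": ["randSortKey", "rank", "total"]
  }
])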
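One thing to watch: comparing rank / total against 0.5 rounds the sample size down when the count is odd. With 3 matching documents the relative ranks are 1/3, 2/3 and 1, so only one document passes. If you need the "rounded up" behaviour from the question, a possible variant is to compare the rank against $ceil of total times the fraction, and to put the filter in a $match in front so that rank and total only cover the matching documents. This is only a sketch: buildSampleStages and sampleSize are hypothetical helper names, not MongoDB APIs, and it assumes MongoDB 5.0+ for $setWindowFields.

```javascript
// Hypothetical helper (not part of MongoDB): builds the stages for a
// "random fraction, rounded up" sample of the documents matching `filter`.
function buildSampleStages(filter, fraction) {
  return [
    // Filter first so that rank and total only cover the matching documents.
    { $match: filter },
    // Attach a random sort key to every remaining document.
    { $addFields: { randSortKey: { $rand: {} } } },
    // Rank by the random key and count the documents in the partition.
    {
      $setWindowFields: {
        partitionBy: null,
        sortBy: { randSortKey: 1 },
        output: {
          rank: { $rank: {} },
          total: { $sum: 1 }
        }
      }
    },
    // Keep rank <= ceil(total * fraction), so the sample size is rounded up.
    {
      $match: {
        $expr: {
          $lte: ["$rank", { $ceil: { $multiply: ["$total", fraction] } }]
        }
      }
    },
    // Drop the helper fields.
    { $unset: ["randSortKey", "rank", "total"] }
  ];
}

// Plain-JavaScript version of the sample-size arithmetic, no database needed.
function sampleSize(total, fraction) {
  return Math.ceil(total * fraction);
}

console.log(sampleSize(3, 0.5)); // 2
```

Since the stages come back as a plain array, they can be spread into a larger pipeline, e.g. db.collection.aggregate([...buildSampleStages({ val: "1" }, 0.5), /* next stages */]).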