So suppose I have a document like:
{
  _id: 1,
  items: ["aaa", "bbb", "ccc", "ddd", "eee"...]
}
I would like to shuffle the items list once and have that order saved in the document, i.e. I don't want to call random or something similar for every query. There are about 200,000 items in this array (not huge, but calling $rand every time I want to retrieve an item would still be inefficient).
So I'm really looking for some kind of manual script that I can run once. It would update this document so that it becomes something like:
{
  _id: 1,
  items: ["ddd", "bbb", "aaa", "eee", "ccc"...]
}
If anyone knows whether this is possible, I'd appreciate it. Thanks.
Otherwise, I'd probably fetch the data, shuffle it using another language, then save it back into Mongo.
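A minimal sketch of that fallback, assuming the array fits comfortably in memory. It uses a Fisher-Yates shuffle; the helper name shuffled and the mongosh calls in the trailing comment are illustrative assumptions, not something I have run against real data:

```javascript
// Fisher-Yates shuffle: returns a shuffled copy and leaves the input untouched.
// `random` is injectable so the behaviour can be checked deterministically.
function shuffled(items, random = Math.random) {
  const out = items.slice();
  for (let i = out.length - 1; i > 0; i--) {
    // Pick j uniformly from 0..i and swap it into position i.
    const j = Math.floor(random() * (i + 1));
    [out[i], out[j]] = [out[j], out[i]];
  }
  return out;
}

// Hypothetical one-off use in mongosh (collection name and _id are assumptions):
//   const doc = db.collection.findOne({ _id: 1 });
//   db.collection.updateOne({ _id: 1 }, { $set: { items: shuffled(doc.items) } });
```

This runs in linear time, so 200,000 items should not be a problem on the client side.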
CodePudding user response:
I'm not sure this is the best way to do this, and note that it only returns the shuffled array; it does not write it back to the document.
https://mongoplayground.net/p/4AH8buOXudQ
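db.collection.aggregate([
  {
    // one output document per array element
    $unwind: {
      path: "$items"
    }
  },
  {
    // randomly pick `size` of those documents
    $sample: {
      size: 100 // number of elements to keep; set this to the array length to shuffle every item
    }
  },
  {
    // collect the sampled elements back into one array per _id
    $group: {
      _id: "$_id",
      item: {
        $push: "$items"
      }
    }
  }
]);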
CodePudding user response:
If you're on MongoDB 5.2 or later, I would do this with an aggregation pipeline update, using the new $sortArray operator together with $rand.
Essentially, we attach a random value to each item, sort the array by that value, and then map it back to the plain items. You can run this update on demand whenever you want to reshuffle the array.
db.collection.updateMany(
  {},
  [
    {
      $addFields: {
        items: {
          $map: {
            input: {
              $sortArray: {
                // pair each item with a random sort key
                input: {
                  $map: {
                    input: "$items",
                    in: { value: "$$this", sortVal: { $rand: {} } }
                  }
                },
                sortBy: { sortVal: 1 }
              }
            },
            // drop the sort key, keeping only the original item
            in: "$$this.value"
          }
        }
      }
    }
  ]
)
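To make the idea easier to check locally, here is the same attach-key, sort, strip-key pattern sketched in plain JavaScript. The function name shuffleBySortKey is hypothetical, and this only mirrors the pipeline's logic; it is not what the server executes:

```javascript
// Plain-JavaScript mirror of the pipeline above.
// `random` is injectable so the ordering can be tested deterministically.
function shuffleBySortKey(items, random = Math.random) {
  return items
    .map((value) => ({ value, sortVal: random() })) // like the inner $map with $rand
    .sort((a, b) => a.sortVal - b.sortVal)          // like $sortArray with sortBy: { sortVal: 1 }
    .map((entry) => entry.value);                   // like the outer $map on "$$this.value"
}
```

Every item gets exactly one key and is kept, so the result is always a permutation of the input.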
If you're on an older version, you can generate a pseudo-random ordering using $reduce (you could implement a bubble sort as well, but its n^2 performance on such a large array is not recommended). Here is an example of how to generate some sort of randomness:
The approach is to iterate over the items array with the $reduce operator. If the generated random value is less than 0.333, we append the item to the end of the new array; if it is less than 0.6666, we push it to the start; and if it is between 0.6666 and 1, we insert it in the middle of the array.
Obviously you can choose whatever random logic you want and add more switch cases. As mentioned, even an actual sort is possible, but at the cost of performance.
db.collection.update({},
[
  {
    $addFields: {
      items: {
        $map: {
          input: {
            $reduce: {
              // pair each item with a random value
              input: {
                $map: {
                  input: "$items",
                  in: { value: "$$this", sortVal: { $rand: {} } }
                }
              },
              initialValue: [],
              in: {
                $switch: {
                  branches: [
                    {
                      // random value below 0.333: append to the end
                      case: { $lt: ["$$this.sortVal", 0.333] },
                      then: { $concatArrays: ["$$value", ["$$this"]] }
                    },
                    {
                      // random value below 0.6666: prepend to the start
                      case: { $lt: ["$$this.sortVal", 0.6666] },
                      then: { $concatArrays: [["$$this"], "$$value"] }
                    }
                  ],
                  // otherwise: insert in the middle of the array built so far
                  default: {
                    $concatArrays: [
                      {
                        $slice: [
                          "$$value",
                          { $round: { $divide: [{ $size: "$$value" }, 2] } }
                        ]
                      },
                      ["$$this"],
                      {
                        $slice: [
                          "$$value",
                          { $round: { $divide: [{ $size: "$$value" }, 2] } },
                          { $add: [{ $size: "$$value" }, 1] }
                        ]
                      }
                    ]
                  }
                }
              }
            }
          },
          // drop the random value, keeping only the original item
          in: "$$this.value"
        }
      }
    }
  }
])
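If you want to experiment with the branch logic before running the update, the $reduce step can be approximated in plain JavaScript. This is only a sketch: pseudoShuffle is a hypothetical name, and Math.round may pick a different middle position than $round when the size divided by 2 lands exactly on .5:

```javascript
// Plain-JavaScript approximation of the $reduce / $switch logic above.
// `random` is injectable so each branch can be exercised deterministically.
function pseudoShuffle(items, random = Math.random) {
  return items.reduce((acc, item) => {
    const r = random();
    if (r < 0.333) return [...acc, item];   // first branch: append to the end
    if (r < 0.6666) return [item, ...acc];  // second branch: prepend to the start
    const mid = Math.round(acc.length / 2); // default: insert in the middle
    return [...acc.slice(0, mid), item, ...acc.slice(mid)];
  }, []);
}
```

Each step copies the accumulator, so this sketch is meant for trying out small arrays and different thresholds, not for shuffling the full 200,000 items.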