I am trying drawing chart using MongoDB data, showing the minimum (floor) price of all transactions over a few days (green color line in the attached).
One problem is that, a few extreme transactions that sold at ridiculously low prices, made the sales floor green line going to zero.
If I filter the data using a query like this:
{"price": {$gte: 0.5}}
Green line will look more realistic:
But this only works because I know this particular chart the price should be around 10 to 20 so anything less than 0.5 is considered extreme.
In reality, I do not know what is the prices being feed into the chart, the price could be between 10 to 20, or between 0.01 to 0.02. So I cannot hard code 0.5.
Is there any MondoDB tricks, or special operators, or standard practise to filter away the extreme/outlier values?
CodePudding user response:
Use $bucketAuto to group your data by value.
In my case, I have 10 docs.
I separate them in 5 of 20% 20% 20% 20% 20%.
Then I remove the first 20% and the last 20%.
So the documents only keep the 60% in the middle.
Some how it ignore extreme values.
db.collection.aggregate([
{
$bucketAuto: {
groupBy: "$price",
buckets: 5,
output: {
docs: {
$push: "$$ROOT"
}
}
}
},
{
$skip: 1
},
{
$sort: {
"_id.max": -1
}
},
{
$skip: 1
},
{
$unwind: "$docs"
},
{
$replaceWith: "$docs"
}
])