My collection looks something like this:
{
{
_id: 'some value',
'product/productId': 'some value',
'product/title': 'some value',
'product/price': 'unknown'
},
{
_id: 'some value',
'product/productId': 'some value',
'product/title': 'some value',
'product/price': '12.57'
}
}
My goal is to find if there are any products that have more than one price. Values of the key "product/price" can be "unknown" or numerical (e.g. "12.75"). Is there a way to write an aggregation pipeline for that or do I need to use a map-reduce algorithm? I tried both options but didn't find the solution.
CodePudding user response:
If I've understood correctly you can try this aggregation pipeline:
First of all, the _id
field is (or should be) unique, so I think you mean another field like id
.
So the trick is to group by that id
and get all prices into an array. Then filter using $match
to get only documents where the total of prices is greater than 1.
db.collection.aggregate([
{
"$group": {
"_id": "$id",
"price": {
"$push": "$product/price"
}
}
},
{
"$match": {
"$expr": {
"$gt": [ { "$size": "$price" }, 1 ]
}
}
}
])
Example here
As added into comments for Joe if you want consider identical values as the same you have to use $addToSet
Example here