Assuming a setup like this:
stores
{
name: "store1",
category: "category1"
},
{
name: "store2",
category: "category2"
}
products
{
store_name: "store1",
name: "product1"
},
{
store_name: "store2",
name: "product2"
}
with hundreds of millions of records in each collection.
I need to query all product names for stores with category != some_parameter. The example above for category category1 should return product2 because store2 has category2 (not category1).
I can't change the DB collections. Is it possible to use $lookup stage within an aggregation pipeline to perform "NOT IN" against another collection? something like: find all products with store not in (find all stores with category = category_param)
CodePudding user response:
You can do it like this:
$lookup
- to fetch store information for each product.$set
with$first
- Since first step will return an array that will always have one item, we will take that item.$match
with$ne
- to filter for all the documents where category is not equal to the requested category.
db.products.aggregate([
{
"$lookup": {
"from": "stores",
"localField": "store_name",
"foreignField": "name",
"as": "store"
}
},
{
"$set": {
"store": {
"$first": "$store"
}
}
},
{
"$match": {
"store.category": {
"$ne": "category1"
}
}
}
])