Currently, I have 3 different aggregate queries that generate different counts, each grouped by companyRegNo. Is there a way to combine these 3 queries?
The employeeActions data looks like this:
{
"email": "[email protected]",
"companyRegNo" : 105,
"event" : {
"created" : ISODate("2022-09-16T06:42:42.761Z"),
"desc" : "COMPLETED_APPLICATIONS",
"note" : "Direct apply",
}
}
{
"email": "[email protected]",
"companyRegNo" : 247,
"event" : {
"created" : ISODate("2022-09-16T06:42:04.387Z"),
"desc" : "COMPLETED_APPLICATIONS",
"note" : "",
}
}
{
"email": "[email protected]",
"companyRegNo" : 247,
"event" : {
"created" : ISODate("2022-09-16T06:42:42.761Z"),
"desc" : "COMPLETED_REVIEW",
"note" : "Sent for review",
}
}
{
"email": "[email protected]",
"companyRegNo" : 105,
"event" : {
"created" : ISODate("2022-09-16T06:42:42.761Z"),
"desc" : "COMPLETED_REVIEW",
"note" : "Data is complete",
}
}
{
"email": "[email protected]",
"companyRegNo" : 247,
"event" : {
"created" : ISODate("2022-09-16T06:42:04.387Z"),
"desc" : "COMPLETED_OFFER",
"note" : "",
}
}
{
"email": "[email protected]",
"companyRegNo" : 227,
"event" : {
"created" : ISODate("2022-09-16T06:42:04.387Z"),
"desc" : "COMPLETED_APPLICATIONS",
"note" : "",
}
}
{
"email": "[email protected]",
"companyRegNo" : 227,
"event" : {
"created" : ISODate("2022-09-16T06:42:42.761Z"),
"desc" : "COMPLETED_APPLICATIONS",
"note" : "",
}
}
{
"email": "[email protected]",
"companyRegNo" : 105,
"event" : {
"created" : ISODate("2022-09-16T06:42:42.761Z"),
"desc" : "COMPLETED_APPLICATIONS",
"note" : "Direct apply",
}
}
The 3 different queries are:
Query for Review count
db.getCollection('employeeActions').aggregate([
{"$match": {
"event.created": {"$gte": ISODate("2022-06-01 00:00:00.000Z")},
"$or": [ {"event.desc": "COMPLETED_REVIEW"}, {"event.note": "Sent for review"}],
}},
{"$group":{"_id":"$companyRegNo","count": {"$sum": 1 } } },
{"$project":{ "companyRegNo":"$_id","count": "$count","_id":0}}
])
The result will be
105 1
227 0
247 1
Query for Offer count
db.getCollection('employeeActions').aggregate([
{"$match": {
"event.created": {"$gte": ISODate("2022-06-01 00:00:00.000Z")},
"event.desc": "COMPLETED_OFFER"
}},
{"$group":{"_id":"$companyRegNo","count": {"$sum": 1 } } },
{"$project":{ "companyRegNo":"$_id","count": "$count","_id":0}},
])
The result will be
105 0
227 0
247 1
Query for Applications count
db.getCollection('employeeActions').aggregate([
{"$match": {
"event.created": {"$gte": ISODate("2022-06-01 00:00:00.000Z")},
"event.desc": "COMPLETED_APPLICATIONS"
}},
{"$group":{"_id":"$companyRegNo","count": {"$sum": 1 } } },
{"$project":{ "companyRegNo":"$_id","count": "$count","_id":0}},
])
The result will be
105 2
227 2
247 1
Is there a way to combine these 3 queries so that it gives me the results as below?
CompanyRegNo Applications Reviews Offers
105 2 1 0
227 2 0 0
247 1 1 1
I tried using $facet to combine these queries. It gives me a combined 3 counts across all companies.
"offers" : 1, "reviews" : 2, "applications" : 5
I need the data to be split across companies.
db.getCollection('employeeActions').aggregate([
{ "$match": { "event.created": {"$gte": ISODate("2022-01-01 00:00:00.000Z")}}},
{ "$facet": {
"offers": [
{"$match":{ "event.desc": "COMPLETED_OFFER" } },
{"$group":{"_id":"$companyRegNo","count": {"$sum": 1 } } },
{ "$count": "offers" },
],
"reviews": [
{"$match":{
"$or": [ {"event.desc": "COMPLETED_REVIEW"}, {"event.note": "Sent for review"}],
}},
{"$group":{"_id":"$companyRegNo","count": {"$sum": 1 }} },
{ "$count": "reviews" },
],
"applications": [
{"$match":{ "event.desc": "COMPLETED_APPLICATIONS" } },
{"$group":{"_id":"$companyRegNo","count":{"$sum": 1 }} },
{ "$count": "applications" },
]
}
},
{ "$project": {
"offers": { "$arrayElemAt": ["$offers.offers", 0] },
"reviews": { "$arrayElemAt": ["$reviews.reviews", 0] },
"applications": { "$arrayElemAt": ["$applications.applications", 0] }
}}
])
CodePudding user response:
Yes, you can use the $facet stage to process multiple aggregation pipelines at the same time:
{
$facet: {
count_1: [ /* Your first query */ ],
count_2: [ /* Your second query */ ],
count_3: [ /* Your third query */ ],
}
}
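Note that placing the three $match/$group pairs inside $facet as they are produces three separate per-company arrays, not one row per company. One possible way to merge them afterwards is sketched below; this is not taken from the answer above, and the names `pipeline`, `countAs`, and `rows` are made up for illustration. It uses `new Date(...)` instead of `ISODate(...)` so the definition also loads outside the mongo shell, and it relies on $sum ignoring fields that are missing from a document.

```javascript
// Hypothetical helper: group by company and store the count under `field`.
const countAs = (field) => ({
  $group: { _id: "$companyRegNo", [field]: { $sum: 1 } },
});

const pipeline = [
  { $match: { "event.created": { $gte: new Date("2022-06-01T00:00:00.000Z") } } },
  { $facet: {
      applications: [
        { $match: { "event.desc": "COMPLETED_APPLICATIONS" } },
        countAs("applications"),
      ],
      reviews: [
        { $match: { $or: [
          { "event.desc": "COMPLETED_REVIEW" },
          { "event.note": "Sent for review" },
        ] } },
        countAs("reviews"),
      ],
      offers: [
        { $match: { "event.desc": "COMPLETED_OFFER" } },
        countAs("offers"),
      ],
  } },
  // Put the three per-company arrays into one array, then one document per entry.
  { $project: { rows: { $concatArrays: ["$applications", "$reviews", "$offers"] } } },
  { $unwind: "$rows" },
  // Regroup by company; a count that is absent from an entry contributes 0.
  { $group: {
      _id: "$rows._id",
      applications: { $sum: "$rows.applications" },
      reviews: { $sum: "$rows.reviews" },
      offers: { $sum: "$rows.offers" },
  } },
  { $sort: { _id: 1 } },
];

// In the mongo shell: db.getCollection('employeeActions').aggregate(pipeline)
```

The extra $project/$unwind/$group stages are the cost of using $facet here: each facet only knows about its own event type, so the per-company rows have to be stitched back together afterwards.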
CodePudding user response:
While the answer by @NeNaD is considered the "best practice" (and I vote for it), since $facet was created for such cases, it is important to know that there are alternatives. Using $facet has disadvantages: one of them is that it "merges" all your documents into one big document, and documents have a size limit (16 MB for a BSON document). If your collection is large and many documents match your first condition, you may want to consider other methods.
One alternative is using $group with $cond:
db.collection.aggregate([
{$match: {
"event.created": {$gte: ISODate("2022-06-01T00:00:00.000Z")}
}},
{$group: {
_id: "$companyRegNo",
ReviewCount: {
$sum: {$cond: [
{$or: [
{$eq: ["$event.desc", "COMPLETED_REVIEW"]},
{$eq: ["$event.note", "Sent for review"]}
]
}, 1, 0]}
},
OfferCount: {
$sum: {$cond: [{$eq: ["$event.desc", "COMPLETED_OFFER"]}, 1, 0]}
},
ApplicationsCount: {
$sum: {$cond: [{$eq: ["$event.desc", "COMPLETED_APPLICATIONS"]}, 1, 0]}
}
}
}
])
See how it works on the playground example
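To sanity-check the expected numbers without a MongoDB instance, the same conditional-sum idea can be mirrored in plain JavaScript over the sample documents from the question. This is only an illustrative sketch: `events`, `countByCompany`, and the output field names are hypothetical, and the `email` field is omitted because none of the counts depend on it.

```javascript
// Sample documents from the question, reduced to the fields the counts use.
const t1 = new Date("2022-09-16T06:42:42.761Z");
const t2 = new Date("2022-09-16T06:42:04.387Z");
const events = [
  { companyRegNo: 105, event: { created: t1, desc: "COMPLETED_APPLICATIONS", note: "Direct apply" } },
  { companyRegNo: 247, event: { created: t2, desc: "COMPLETED_APPLICATIONS", note: "" } },
  { companyRegNo: 247, event: { created: t1, desc: "COMPLETED_REVIEW", note: "Sent for review" } },
  { companyRegNo: 105, event: { created: t1, desc: "COMPLETED_REVIEW", note: "Data is complete" } },
  { companyRegNo: 247, event: { created: t2, desc: "COMPLETED_OFFER", note: "" } },
  { companyRegNo: 227, event: { created: t2, desc: "COMPLETED_APPLICATIONS", note: "" } },
  { companyRegNo: 227, event: { created: t1, desc: "COMPLETED_APPLICATIONS", note: "" } },
  { companyRegNo: 105, event: { created: t1, desc: "COMPLETED_APPLICATIONS", note: "Direct apply" } },
];

// One pass over the documents: each condition adds 1 or 0 to its own counter,
// which is what $sum over $cond does inside the $group stage.
function countByCompany(docs, since) {
  const totals = new Map();
  for (const { companyRegNo, event } of docs) {
    if (event.created < since) continue; // same role as the $match stage
    const row = totals.get(companyRegNo) ||
      { companyRegNo, Applications: 0, Reviews: 0, Offers: 0 };
    if (event.desc === "COMPLETED_APPLICATIONS") row.Applications += 1;
    if (event.desc === "COMPLETED_REVIEW" || event.note === "Sent for review") row.Reviews += 1;
    if (event.desc === "COMPLETED_OFFER") row.Offers += 1;
    totals.set(companyRegNo, row);
  }
  // Sort by company number so the output order is stable.
  return [...totals.values()].sort((a, b) => a.companyRegNo - b.companyRegNo);
}

const result = countByCompany(events, new Date("2022-06-01T00:00:00.000Z"));
console.table(result);
```

On the sample data this yields the table requested in the question: 105 with 2/1/0, 227 with 2/0/0, and 247 with 1/1/1 for Applications/Reviews/Offers. A company with no matching events still gets a row of zeros, as long as at least one of its documents passes the date filter.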