I need to get two different sets of data from the same collection. If an item in one set matches an item in the other set, the two should be combined into one; if not, the item should be converted to a desired format. To make this clearer, suppose there is a collection named collection1, and I need to get started tasks matching one condition and completed tasks matching another condition from it. I managed to do that like this:
db.collection1.aggregate([
  {
    $facet: {
      "started": [...],
      "completed": [...]
    }
  }
]).toArray()
Assume that started is an array of this:
interface Starts {
  supplier: string;
  publisher: string;
  partner: string;
  buyer: string;
  started_at: string;
  starts: number;
}
And completed is an array of this:
interface Completes {
  supplier: string;
  publisher: string;
  partner: string;
  buyer: string;
  finished_at: string;
  completes: number;
  revenue: number;
}
Some started tasks with a certain supplier, partner, buyer and publisher might actually be completed, with their completion info present in the completed array, so the two need to be merged. Some might not be (I will not have finished_at, completes or revenue for them, so I would put null for these values). Some completed tasks might also not be related to any started task (I will not have starts or started_at for them, so I would put null for these values). Ergo, the final result will look like an array of this:
interface StartsCompletes {
  supplier: string;
  publisher: string;
  partner: string;
  buyer: string;
  started_at: string;
  finished_at: string;
  starts: number;
  completes: number;
  revenue: number;
}
I can loop through the output of $facet with JavaScript for-loops and make it happen, but can I somehow add another stage to my aggregation pipeline and do it with MongoDB?
CodePudding user response:
I think it could be easier to achieve this in memory, but here is a way to merge the arrays using $reduce. Essentially, I iterate over the concatenated array item by item. If an item's supplier x partner x buyer x publisher combination has not been seen yet, I push it to the result array; if it already exists, I merge the two objects.
Overall it looks like this:
db.collection1.aggregate([
  {
    $facet: {
      "started": [...],
      "completed": [...]
    }
  },
  {
    $project: {
      merged: {
        $reduce: {
          // Walk over started and completed as one list
          input: { $concatArrays: ["$started", "$completed"] },
          // values: merged items so far; mapping: keys already seen
          initialValue: { values: [], mapping: [] },
          in: {
            $cond: [
              {
                // Has this supplier/partner/buyer/publisher key been seen?
                $in: [
                  {
                    $concat: [
                      "$$this.supplier",
                      "$$this.partner",
                      "$$this.buyer",
                      "$$this.publisher"
                    ]
                  },
                  "$$value.mapping.key"
                ]
              },
              {
                // Seen: merge the current item into the matching entry
                values: {
                  $map: {
                    input: "$$value.values",
                    as: "arrValue",
                    in: {
                      $cond: [
                        {
                          $and: [
                            { $eq: ["$$arrValue.supplier", "$$this.supplier"] },
                            { $eq: ["$$arrValue.partner", "$$this.partner"] },
                            { $eq: ["$$arrValue.buyer", "$$this.buyer"] },
                            { $eq: ["$$arrValue.publisher", "$$this.publisher"] }
                          ]
                        },
                        { $mergeObjects: ["$$arrValue", "$$this"] },
                        "$$arrValue"
                      ]
                    }
                  }
                },
                mapping: "$$value.mapping"
              },
              {
                // Not seen: append the item and record its key
                values: { $concatArrays: ["$$value.values", ["$$this"]] },
                mapping: {
                  $concatArrays: [
                    "$$value.mapping",
                    [
                      {
                        key: {
                          $concat: [
                            "$$this.supplier",
                            "$$this.partner",
                            "$$this.buyer",
                            "$$this.publisher"
                          ]
                        },
                        value: { $size: "$$value.mapping" }
                      }
                    ]
                  ]
                }
              }
            ]
          }
        }
      }
    }
  },
  {
    $project: {
      mergedResults: "$merged.values"
    }
  }
])
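For comparison, the in-memory route mentioned at the top of this answer could look roughly like the sketch below. It is only an illustration: the function names keyOf and mergeStartsCompletes and the sample rows are made up for this example, and it assumes started and completed are the two arrays returned by the $facet stage. Missing fields are filled with null, as the question asks.

```javascript
// Hypothetical helper: build a key from the four identifying fields.
// JSON.stringify keeps the parts separated, so "ab"+"c" and "a"+"bc" differ.
function keyOf(task) {
  return JSON.stringify([task.supplier, task.publisher, task.partner, task.buyer]);
}

// Hypothetical helper: merge the two $facet arrays.
// Fields that exist on only one side stay null on the merged item.
function mergeStartsCompletes(started, completed) {
  const blank = { started_at: null, finished_at: null, starts: null, completes: null, revenue: null };
  const merged = new Map();
  for (const task of [...started, ...completed]) {
    const key = keyOf(task);
    // Start from the existing entry (or the null template) and overlay this task.
    merged.set(key, { ...(merged.get(key) ?? blank), ...task });
  }
  return [...merged.values()];
}

// Made-up sample data, shaped like the Starts and Completes interfaces.
const started = [
  { supplier: "s1", publisher: "p1", partner: "pa1", buyer: "b1", started_at: "2022-01-01", starts: 3 },
  { supplier: "s2", publisher: "p1", partner: "pa1", buyer: "b1", started_at: "2022-01-03", starts: 1 },
];
const completed = [
  { supplier: "s1", publisher: "p1", partner: "pa1", buyer: "b1", finished_at: "2022-01-02", completes: 2, revenue: 10 },
  { supplier: "s3", publisher: "p1", partner: "pa1", buyer: "b1", finished_at: "2022-01-04", completes: 1, revenue: 5 },
];

const result = mergeStartsCompletes(started, completed);
console.log(result);
```

With this sample, the s1 rows collapse into one merged item, the s2 item keeps null completion fields, and the s3 item keeps null start fields.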
CodePudding user response:
Another option is to replace the $facet stage with a simple $group:
db.collection.aggregate([
  {$group: {
    _id: {
      supplier: "$supplier",
      publisher: "$publisher",
      partner: "$partner",
      buyer: "$buyer"
    },
    started_at: {$max: "$started_at"},
    starts: {$max: "$starts"},
    revenue: {$max: "$revenue"},
    finished_at: {$max: "$finished_at"},
    completes: {$max: "$completes"}
  }},
  {$set: {
    buyer: "$_id.buyer",
    partner: "$_id.partner",
    publisher: "$_id.publisher",
    supplier: "$_id.supplier",
    _id: "$$REMOVE"
  }}
])
This way there should be no need to worry about the document size limit.
See how it works on the playground example
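To see why the $group version yields null for the fields a group never had, here is a rough plain-JavaScript emulation of it. It is a sketch, not MongoDB's implementation: groupByKey, maxIgnoringNull and the sample documents are invented for illustration, and it assumes values within a field share one type so that `>` orders them sensibly. It relies on the documented behaviour that $max inside $group ignores null or missing values and returns null when nothing else is left.

```javascript
// Fields aggregated with $max in the pipeline above.
const FIELDS = ["started_at", "starts", "revenue", "finished_at", "completes"];

// Emulates $max in $group: skip null/missing values, give null if none remain.
function maxIgnoringNull(values) {
  const present = values.filter((v) => v !== null && v !== undefined);
  return present.length === 0 ? null : present.reduce((a, b) => (b > a ? b : a));
}

// Emulates the $group + $set stages: one output row per identifying key.
function groupByKey(docs) {
  const groups = new Map();
  for (const doc of docs) {
    const key = JSON.stringify([doc.supplier, doc.publisher, doc.partner, doc.buyer]);
    if (!groups.has(key)) groups.set(key, []);
    groups.get(key).push(doc);
  }
  return [...groups.values()].map((group) => {
    const { supplier, publisher, partner, buyer } = group[0];
    const out = { supplier, publisher, partner, buyer };
    for (const field of FIELDS) {
      out[field] = maxIgnoringNull(group.map((d) => d[field]));
    }
    return out;
  });
}

// Made-up documents: s1 has both a start row and a completion row, s2 only a start row.
const docs = [
  { supplier: "s1", publisher: "p1", partner: "pa1", buyer: "b1", started_at: "2022-01-01", starts: 3 },
  { supplier: "s1", publisher: "p1", partner: "pa1", buyer: "b1", finished_at: "2022-01-02", completes: 2, revenue: 10 },
  { supplier: "s2", publisher: "p1", partner: "pa1", buyer: "b1", started_at: "2022-01-03", starts: 1 },
];

const grouped = groupByKey(docs);
console.log(grouped);
```

Here the two s1 documents fold into a single row carrying both start and completion fields, while the s2 row ends up with null for finished_at, completes and revenue.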