I need the find matching pairs in a single collection based on the values of fields and update them.
Let's say I have the following collection:
[{ "_id": "a", "client": 1, "tag": "", "debit": 12, "credit": -20 },
{ "_id": "b", "client": 1, "tag": "G", "debit": -12, "credit": 20 } ,
{ "_id": "c", "client": 1, "tag": "G", "debit": 12, "credit": 20 },
{ "_id": "d", "client": 2, "tag": "", "debit": 13, "credit": 0 },
{ "_id": "e", "client": 2, "tag": "G", "debit": -13, "credit": 0 },
{ "_id": "f", "client": 3, "tag": "", "debit": 13, "credit": 0 },
{ "_id": "g", "client": 3, "tag": "G", "debit": 14, "credit": 0 },
{ "_id": "h", "client": 4, "tag": "", "debit": 0, "credit": 0 }]
Now for two documents to build a pair they have to:
- have the same client
- one has the "" tag, one has the "G" tag
- one has to have the inverse value of debit OR credit
- bonus: ignore 0 if possible (since -0 == 0, I could also just remove the field)
- bonus: if multiple pairs with "G" are found only take one of the two
So the example collection would result in the following pairs:
- ("a", "b") or ("a", "c"):
- a.client == b.client && a.tag == "" && b.tag == "G" && a.debit == -b.debit
- for "c": a.credit == -c.credit
- ("d", "e"):
- normal pair: d.client == e.client && d.tag == "" && e.tag == "G" && d.debit == -e.debit
- ("f", "g") DON'T match, because 0 doesn't count
- "h" has no match
The result could look something like this:
[{ "pair": ["a", "b"] }, { "pair": ["d", "e"] }]
The goal is to update all documents who appear in any pair, so this could also just be a list of ids.
Final goal:
[{ "_id": "a", "client": 1, "tag": "", "debit": 12, "credit": -20, "s": true },
{ "_id": "b", "client": 1, "tag": "G", "debit": -12, "credit": 20, "s": true } ,
{ "_id": "c", "client": 1, "tag": "G", "debit": 12, "credit": 20 },
{ "_id": "d", "client": 2, "tag": "", "debit": 13, "credit": 0, "s": true },
{ "_id": "e", "client": 2, "tag": "G", "debit": -13, "credit": 0, "s": true },
{ "_id": "f", "client": 3, "tag": "", "debit": 13, "credit": 0 },
{ "_id": "g", "client": 3, "tag": "G", "debit": 14, "credit": 0 },
{ "_id": "h", "client": 4, "tag": "", "debit": 0, "credit": 0 }]
I know I can group by client but how do I group by non matching fields?
And how can I compare the values of two documents, since I have to compare each documents with all others..
In SQL I solved it with this statement, maybe this helps or makes it clearer (I know it doesn't filter duplicates, the one with duplicates is too long to get the point across):
SELECT d1.id, d2.id
FROM documents d1, documents d2
WHERE d1.client = d2.client
AND d1.tag = "" AND d2.tag = "G"
AND (d1.debit = -d2.debit OR d1.credit = -d2.credit)
Thanks :)
CodePudding user response:
I hope you get a more succinct answer, but here's one way to update (using "$merge"
) the documents in the collection using your logic to identify them as pairs.
db.documents.aggregate([
{
"$match": {
"tag": ""
}
},
{
"$set": {
"docStore": "$$ROOT"
}
},
{
"$lookup": {
"from": "documents",
"localField": "client",
"foreignField": "client",
"let": {
"credit": "$credit",
"debit": "$debit",
"tag": "$tag"
},
"pipeline": [
{
"$match": {
"$expr": {
"$and": [
{"$eq": ["$tag", "G"]},
{
"$or": [
{
"$and": [
{"$ne": ["$$credit", 0]},
{"$eq": [{"$add": ["$$credit", "$credit"]}, 0]}
]
},
{
"$and": [
{"$ne": ["$$debit", 0]},
{"$eq": [{"$add": ["$$debit", "$debit"]}, 0]}
]
}
]
}
]
}
}
}
],
"as": "pairs"
}
},
{
"$match": {
"$expr": {
"$gt": [{"$size": "$pairs"}, 0]
}
}
},
{
"$set": {
"pair": [
"$docStore",
{"$first": "$pairs"}
]
}
},
{
"$project": {
"_id": 0,
"pair": 1
}
},
{"$unwind": "$pair"},
{
"$replaceWith": {
"$mergeObjects": [
"$pair",
{"s": true}
]
}
},
{"$merge": "documents"}
])
Try it on mongoplayground.net.