I have a large collection of JSON documents that has many entries of the following format (contents in my example are important to my question):
doc1
{
"data": [{
"entry": {
"fieldA": "aaa",
"fieldB": "xxx"
}
},
{
"entry": {
"fieldA": "ccc",
"fieldB": "yyy"
}
},
{
"entry": {
"fieldA": "eee",
"fieldB": "xxx"
}
}
]
}
doc2
{
"data": [{
"entry": {
"fieldA": "aaa",
"fieldB": "xxx"
}
},
{
"entry": {
"fieldA": "ccc",
"fieldB": "yyy"
}
},
{
"entry": {
"fieldA": "eee",
"fieldB": "nnn"
}
}
]
}
...
docN
{
"data": [{
"entry": {
"fieldA": "aaa",
"fieldB": "yyy"
}
},
{
"entry": {
"fieldA": "ccc",
"fieldB": "yyy"
}
},
{
"entry": {
"fieldA": "eee",
"fieldB": "xxx"
}
}
]
}
What I want to do is create a query that follows the below rule:
Only returns documents where it has a fieldA
that contains aaa
and has another entry where fieldA
contains eee
AND where the fieldB
of those entries have values that match.
In the above example, that would be the first top level document as the fieldB
for both sub entries are xxx
Additionally it would be nice to have just the entries pruned in the returned document, instead of the whole document
CodePudding user response:
I hpoe this will return as you expected
[{
$unwind: {
path: '$data'
}
}, {
$match: {
$or: [
{
'data.entry.fieldA': 'aaa'
},
{
'data.entry.fieldA': 'eee'
}
]
}
}, {
$group: {
_id: '$_id',
data: {
$push: {
fieldA: '$data.entry.fieldA',
fieldB: '$data.entry.fieldB'
}
}
}
}]
CodePudding user response:
One option without unwinding and grouping again is:
- Clean your data as requested.
$match
only wanted documents
db.collection.aggregate([
{$project: {
data: {$filter: {
input: "$data",
cond: {$or: [
{$eq: ["$$this.entry.fieldA", "aaa"]},
{$eq: ["$$this.entry.fieldA", "eee"]}
]}
}}
}},
{$match: {
$expr: {$and: [
{$eq: [{$size: "$data"}, 2]},
{$eq: [{$first: "$data.entry.fieldB"}, {$last: "$data.entry.fieldB"}]}
]}
}}
])
See how it works on the playground example