My data is like this (I have taken out unnecessary fields and only left the dates in):
[
{
"date": "2022-09-25T12:35:51.833Z",
"scans": [
{
"date": "2022-09-01T05:00:00.000Z",
},
{
"date": "2022-08-04T05:00:00.000Z",
},
{
"date": "2022-09-01T05:00:00.000Z",
},
{
"date": "2022-09-06T05:00:00.000Z",
}
],
},
{
"date": "2022-09-25T12:55:12.018Z",
"scans": [
{
"date": "1919-11-30T07:00:00.000Z",
},
{
"date": "1919-11-30T07:00:00.000Z",
},
{
"date": "1926-11-30T07:00:00.000Z",
},
{
"date": "1919-11-30T07:00:00.000Z",
},
{
"date": "1919-11-30T07:00:00.000Z",
},
{
"date": "1919-11-30T07:00:00.000Z",
},
{
"date": "1919-11-30T07:00:00.000Z",
},
{
"date": "1919-11-30T07:00:00.000Z",
}
],
},
{
"date": "2022-09-25T13:49:20.639Z",
"scans": [
{
"date": "2022-09-15T05:00:00.000Z",
},
{
"date": "2022-09-12T05:00:00.000Z",
}
],
},
{
"date": "2022-09-25T13:58:02.755Z",
"scans": [
{
"date": "2022-09-13T05:00:00.000Z",
},
{
"date": "2022-08-20T05:00:00.000Z",
},
{}
],
},
{
"date": "2022-09-25T14:17:04.947Z",
"scans": [
{
"date": "2022-09-12T05:00:00.000Z",
}
],
},
{
"date": "2022-09-25T14:17:49.489Z",
"scans": [
{
"date": "2022-09-13T05:00:00.000Z",
}
],
},
{
"date": "2022-09-25T14:19:26.068Z",
"scans": [{}],
},
{
"date": "2022-09-25T14:20:07.569Z",
"scans": [
{
"date": "2022-09-12T05:00:00.000Z",
}
],
},
{
"date": "2022-09-25T14:33:17.783Z",
"scans": [
{
"date": "2022-08-15T07:00:00.000Z",
}
],
},
{
"date": "2022-09-25T14:33:41.050Z",
"scans": [
{
"date": "2022-08-19T07:00:00.000Z",
}
],
},
{
"date": "2022-09-25T14:34:03.172Z",
"scans": [
{
"date": "2022-09-07T07:00:00.000Z",
}
],
},
{
"date": "2022-09-25T15:28:23.723Z",
"scans": [
{
"date": "2022-08-19T05:00:00.000Z",
}
],
},
{
"date": "2022-09-25T15:28:49.211Z",
"scans": [
{
"date": "2022-09-09T05:00:00.000Z",
}
],
}
]
I would like to get back the same list of documents but filtered according to these:
- Include documents that have at least one scan where there is no date field.
- Include documents that have scans where the date is before 90 days of the date field of the document.
- Include documents that have scans where the date is after the date field of the document.
Bottom line is that I am trying to find documents with "incorrect" scans. How can this be achieved?
CodePudding user response:
You can try the following aggregation query:
db.collection.aggregate([
{
"$addFields": {
"diffData": {
"$map": {
"input": "$scans",
"as": "item",
"in": {
"$dateDiff": {
"startDate": {
"$toDate": "$$item.date"
},
"unit": "day",
"endDate": {
"$toDate": "$date"
}
}
}
}
}
}
},
{
"$match": {
"$or": [
{
diffData: {
$elemMatch: {
"$eq": null
}
}
},
{
diffData: {
$elemMatch: {
"$gt": 90
}
}
},
{
diffData: {
$elemMatch: {
"$lt": 0
}
}
}
]
}
},
{
"$unset": "diffData"
}
])
In this query, we create a new field called diffData
, where we store the difference between the scan date and the root date in days. Finally, we filter the elements, based on the criteria specified.