Home > Software design >  How can I compare and filter two fields in the same document in MongoDB?
How can I compare and filter two fields in the same document in MongoDB?

Time:10-05

My data is like this (I have taken out unnecessary fields and only left the dates in):

[
  {
    "date": "2022-09-25T12:35:51.833Z",
    "scans": [
      {
        "date": "2022-09-01T05:00:00.000Z",
      },
      {
        "date": "2022-08-04T05:00:00.000Z",
      },
      {
        "date": "2022-09-01T05:00:00.000Z",
      },
      {
        "date": "2022-09-06T05:00:00.000Z",
      }
    ],
  },
  {
    "date": "2022-09-25T12:55:12.018Z",
    "scans": [
      {
        "date": "1919-11-30T07:00:00.000Z",
      },
      {
        "date": "1919-11-30T07:00:00.000Z",
      },
      {
        "date": "1926-11-30T07:00:00.000Z",
      },
      {
        "date": "1919-11-30T07:00:00.000Z",
      },
      {
        "date": "1919-11-30T07:00:00.000Z",
      },
      {
        "date": "1919-11-30T07:00:00.000Z",
      },
      {
        "date": "1919-11-30T07:00:00.000Z",
      },
      {
        "date": "1919-11-30T07:00:00.000Z",
      }
    ],
  },
  {
    "date": "2022-09-25T13:49:20.639Z",
    "scans": [
      {
        "date": "2022-09-15T05:00:00.000Z",
      },
      {
        "date": "2022-09-12T05:00:00.000Z",
      }
    ],
  },
  {
    "date": "2022-09-25T13:58:02.755Z",
    "scans": [
      {
        "date": "2022-09-13T05:00:00.000Z",
      },
      {
        "date": "2022-08-20T05:00:00.000Z",
      },
      {}
    ],
  },
  {
    "date": "2022-09-25T14:17:04.947Z",
    "scans": [
      {
        "date": "2022-09-12T05:00:00.000Z",
      }
    ],
  },
  {
    "date": "2022-09-25T14:17:49.489Z",
    "scans": [
      {
        "date": "2022-09-13T05:00:00.000Z",
      }
    ],
  },
  {
    "date": "2022-09-25T14:19:26.068Z",
    "scans": [{}],
  },
  {
    "date": "2022-09-25T14:20:07.569Z",
    "scans": [
      {
        "date": "2022-09-12T05:00:00.000Z",
      }
    ],
  },
  {
    "date": "2022-09-25T14:33:17.783Z",
    "scans": [
      {
        "date": "2022-08-15T07:00:00.000Z",
      }
    ],
  },
  {
    "date": "2022-09-25T14:33:41.050Z",
    "scans": [
      {
        "date": "2022-08-19T07:00:00.000Z",
      }
    ],
  },
  {
    "date": "2022-09-25T14:34:03.172Z",
    "scans": [
      {
        "date": "2022-09-07T07:00:00.000Z",
      }
    ],
  },
  {
    "date": "2022-09-25T15:28:23.723Z",
    "scans": [
      {
        "date": "2022-08-19T05:00:00.000Z",
      }
    ],
  },
  {
    "date": "2022-09-25T15:28:49.211Z",
    "scans": [
      {
        "date": "2022-09-09T05:00:00.000Z",
      }
    ],
  }
]

I would like to get back the same list of documents but filtered according to these:

  1. Include documents that have at least one scan where there is no date field.
  2. Include documents that have scans where the date is before 90 days of the date field of the document.
  3. Include documents that have scans where the date is after the date field of the document.

Bottom line is that I am trying to find documents with "incorrect" scans. How can this be achieved?

CodePudding user response:

You can try the following aggregation query:

db.collection.aggregate([
  {
    "$addFields": {
      "diffData": {
        "$map": {
          "input": "$scans",
          "as": "item",
          "in": {
            "$dateDiff": {
              "startDate": {
                "$toDate": "$$item.date"
              },
              "unit": "day",
              "endDate": {
                "$toDate": "$date"
              }
            }
          }
        }
      }
    }
  },
  {
    "$match": {
      "$or": [
        {
          diffData: {
            $elemMatch: {
              "$eq": null
            }
          }
        },
        {
          diffData: {
            $elemMatch: {
              "$gt": 90
            }
          }
        },
        {
          diffData: {
            $elemMatch: {
              "$lt": 0
            }
          }
        }
      ]
    }
  },
  {
    "$unset": "diffData"
  }
])

Playground link.

In this query, we create a new field called diffData, where we store the difference between the scan date and the root date in days. Finally, we filter the elements, based on the criteria specified.

  • Related