Home > other >  Mongo Query to get duplicate entry from sub document array
Mongo Query to get duplicate entry from sub document array

Time:04-12

The objective here is to find the duplicate records in the sub document based on the condition and return the output as mentioned below.

Dataset

[{

    _id: "objectId",
    name: "product_a",
    array: [{
        _id: "objectId",
        start: "2022-01-01 00:00:00.000Z",
        end: "2022-01-30 00:00:00.000Z",
        status: "active",
        person: "A" //reference objectId
    },
    {
        _id: "objectId",
        start: "2022-03-01 00:00:00.000Z",
        end: null,
        status: "active",
        person: "A"
    },
    {
        _id: "objectId",
        start: "2022-03-01 00:00:00.000Z",
        end: null,
        status: "active",
        person: "A"
    },
    {
        _id: "objectId",
        start: "2022-02-01 00:00:00.000Z",
        end: null,
        status: "active",
        person: "B"
    }]
},
{

    _id: "objectId",
    name: "product_b",
    array: [{
        _id: "objectId",
        start: "2021-12-30 00:00:00.000Z",
        end: "2022-01-30 00:00:00.000Z",
        status: "active",
        person: "C"
    },
    {
        _id: "objectId",
        start: "2022-03-01 00:00:00.000Z",
        end: null,
        status: "active",
        person: "C"
    },
    {
        _id: "objectId",
        start: "2022-03-01 00:00:00.000Z",
        end: null,
        status: "active",
        person: "C"
    },
    {
        _id: "objectId",
        start: "2022-03-20 00:00:00.000Z",
        end: null,
        status: "active",
        person: "D"
    }]
}]

Expected output

[
    {
        _id: "objectId",
        name: "product_a",
        targetIds: ["A"]
    },
    {
        _id: "objectId",
        name: "product_b",
        targetIds: ["C"]
    }
]

I'm trying to fetch the duplicate person value as array (targetIds) from each document where the person has two active records with end as null in the sub document. Below is my query that I have tried

connectCollection.aggregate([
    {
        $unwind: "$array"
    },
    {
        "$match": {
            $and: [
                {
                    "array.status": {
                        "$exists": true,
                        "$eq": "active"
                    }
                },
                {
                    "array.end": {
                        "$exists": true,
                        "$eq": null
                    }
                }
            ]
        }
    },
    {
        $group: {
            _id: {
                product_id: "$_id",
                product_name: "$name",
                targetIds: "$array.person"
            },
            count: {
                $sum: 1
            }
        }
    },
    {
        $match: {
            count: {
                $gt: 1
            }
        }
    },
    { $sort: { _id: 1 } }
])

CodePudding user response:

All you were missing is one more step at the end:

{
    $group: {
      _id: "$_id.product_id",
      name: {
        $first: "$_id.product_name"
      },
      targetIds: {
        $push: "$_id.targetIds"
      }
    }
  }

And change the second array of the sample data from user to person, as your query (and @Yong Shun) stated. You can see it works here with some small changes in the input data, to demonstrate the solution

  • Related