Home > Software design >  How to remove duplicate values inside an array in MONGODB?
How to remove duplicate values inside an array in MONGODB?

Time:10-05

Individuals schema:

[
    {
      "_id": "e6f32800-240e-11ec-b291-51abbaa8f015",
      "firstName": "A",
      "lastName": "Cris",
      "phoneNumber": "111-222-3333",
      "socialMedia": "FaceBook",
      "DOB": "01/01/1990",
      "Theater": 1,
      "__v": 0
    },
    {
      "_id": "e7092b90-240f-11ec-8812-d375202a89ac",
      "firstName": "A",
      "lastName": "Cris",
      "phoneNumber": "111-222-3333",
      "socialMedia": "FaceBook",
      "DOB": "01/01/1990",
      "Theater": 2,
      "__v": 0
    },
    {
      "_id": "e8e78880-240f-11ec-8812-d375202a89ac",
      "firstName": "A",
      "lastName": "Cris",
      "phoneNumber": "111-222-3333",
      "socialMedia": "Twitter",
      "DOB": "01/01/1990",
      "Theater": 3,
      "__v": 0
    },
    {
      "_id": "ee20f750-240f-11ec-8812-d375202a89ac",
      "firstName": "B",
      "lastName": "Hood",
      "phoneNumber": "333-444-5555",
      "socialMedia": "Friends",
      "DOB": "05/05/1993",
      "Theater": 1,
      "__v": 0
    },
    {
      "_id": "76d6ad60-2410-11ec-bc7f-8dbab8f2c871",
      "firstName": "B",
      "lastName": "Hood",
      "phoneNumber": "333-444-5555",
      "socialMedia": "Radio 900 AM",
      "DOB": "05/05/1993",
      "Theater": 2,
      "__v": 0
    },
    {
      "_id": "f053b5d0-240f-11ec-8812-d375202a89ac",
      "firstName": "B",
      "lastName": "Hood",
      "phoneNumber": "333-444-5555",
      "socialMedia": "Radio 900 AM",
      "DOB": "05/05/1993",
      "Theater": 3,
      "__v": 0
    },
    {
      "_id": "79946dd0-2410-11ec-bc7f-8dbab8f2c871",
      "firstName": "C",
      "lastName": "Mohammad",
      "phoneNumber": "555-666-7777",
      "socialMedia": "Radio 104.2 PM",
      "DOB": "10/10/1995",
      "Theater": 1,
      "__v": 0
    },
    {
      "_id": "7b4244e0-2410-11ec-bc7f-8dbab8f2c871",
      "firstName": "C",
      "lastName": "Mohammad",
      "phoneNumber": "555-666-7777",
      "socialMedia": "News",
      "DOB": "10/10/1995",
      "Theater": 2,
      "__v": 0
    },
    {
      "_id": "7d097050-2410-11ec-bc7f-8dbab8f2c871",
      "firstName": "C",
      "lastName": "Mohammad",
      "phoneNumber": "555-666-7777",
      "socialMedia": "News",
      "DOB": "10/10/1995",
      "Theater": 3,
      "__v": 0
    }
]

MovieTheater schema:

[
    {
      "_id": 1,
      "TheaterName": "AMC Katy Mill 20",
      "location": "Katy, TX",
      "__v": 0
    },
    {
      "_id": 2,
      "TheaterName": "AMC First Colony",
      "location": "Sugar Land, TX",
      "__v": 0
    },
    {
      "_id": 3,
      "TheaterName": "AMC Deerbrook",
      "location": "Humble, TX",
      "__v": 0
    }
]

Noted:

  1. Theater field in Individuals schema is _id identifier of MovieTheater schema.
  2. socialMedia field is the way individuals know about MovieTheater

I can do how many times (count) individuals joined the movie theater, push all socialMedia into an array, projection by using:

db.collection.aggregate([
  {
    "$group": {
      "_id": {
        "firstName": "$firstName",
        "lastName": "$lastName",
        "phoneNumber": "$phoneNumber",
        "DOB": "$DOB"
      },
      "count": {
        "$sum": 1
      },
      "socialMedia": {
        "$push": "$socialMedia"
      }
    }
  },
 {
    "$project": {
        "_id.firstName": 1,
        "_id.lastName": 1,
        "Count": 1,
        "socialMedia": 1
    }
 }
])

Output:

[
  {
    "_id": {
      "firstName": "A",
      "lastName": "Cris"
    },
    "Count": 3,
    "socialMedia": [
      "FaceBook",
      "FaceBook",
      "Twitter"
    ]
  },
  {
    "_id": {
      "firstName": "B",
      "lastName": "Hood"
    },
    "Count": 3,
    "socialMedia": [
      "Radio 900 AM",
      "Radio 900 AM",
      "Friends"
    ]
  },
  {
    "_id": {
      "firstName": "C",
      "lastName": "Mohammad"
    },
    "Count": 3,
    "socialMedia": [
      "News",
      "News",
      "Radio 104.2 PM"
    ]
  }
]

But I don't know to remove duplicate values in socialMedia.

My expected result looks like this:

{
    "_id": {
      "firstName": "A",
      "lastName": "Cris"
    },
    "Count": 3,
    "socialMedia": [
      "FaceBook",
      "Twitter"
    ]
}

Thank you so much.

CodePudding user response:

Use $addToSet instead of $push for socialMedia.

$addToSet

$addToSet returns an array of all unique values that results from applying an expression to each document in a group.

db.collection.aggregate([
  {
    "$group": {
      "_id": {
        "firstName": "$firstName",
        "lastName": "$lastName",
        "phoneNumber": "$phoneNumber",
        "DOB": "$DOB"
      },
      "count": {
        "$sum": 1
      },
      "socialMedia": {
        "$addToSet": "$socialMedia"
      }
    }
  },
  {
    "$project": {
      "_id.firstName": 1,
      "_id.lastName": 1,
      "Count": 1,
      "socialMedia": 1
    }
  }
])

Sample Mongo Playground

  • Related