Extract unique values from Array in aggregation pipeline


In my aggregation pipeline, after some previous aggregation, I end up with documents similar to these:

[
  {
    "_id": 0,
    "group": "Electronics",
    // other fields omitted for brevity
    "articles": [
      {
        "name": "Gameboy",
        // Even more fields omitted for brevity
        "area": "Video Games"
      },
      {
        "name": "Playstation",
        "area": "Video Games"
      },
      {
        "name": "Refrigerator",
        "area": "White Goods"
      }
    ]
  },
  {
    "_id": 1,
    "group": "Food",
    "articles": [
      {
        "name": "Apple",
        "area": "Fruit"
      },
      {
        "name": "Pear",
        "area": "Fruit"
      }
    ]
  }
]

I need to extract the unique `area` values from the arrays while keeping the rest of each document intact (the articles aren't needed afterwards, though). The result should look like this:


[
    {
        "_id": 0,
        "group": "Electronics",
        // other fields...
        "articleAreas": [ "Video Games", "White Goods" ]
    },
    {
        "_id": 1,
        "group": "Food",
        "articleAreas": [ "Fruit" ]
    }
]

My gut feeling is that there should be some way to use $addToSet similar to what can be done in $group stages, but I couldn't figure out how.

CodePudding user response:

You can use an `$addFields` stage together with the `$setUnion` operator to get the unique values from the array:

  • `$addFields` adds the new field `articleAreas`
  • `$setUnion` returns the unique values from the array of `articles.area` values

db.collection.aggregate([
  {
    $addFields: {
      articleAreas: {
        // $setUnion with a single array argument returns
        // that array with duplicates removed
        $setUnion: "$articles.area"
      }
    }
  }
])
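To see what this stage computes, here is a plain-JavaScript sketch of the same transformation (de-duplicating `articles.area` into `articleAreas` and dropping `articles`), run on the sample documents from the question. The variable names `docs` and `result` are illustrative, not part of any MongoDB API:

```javascript
// Plain-JS equivalent of the $addFields/$setUnion step:
// collect articles.area per document and keep only unique values.
const docs = [
  {
    _id: 0,
    group: "Electronics",
    articles: [
      { name: "Gameboy", area: "Video Games" },
      { name: "Playstation", area: "Video Games" },
      { name: "Refrigerator", area: "White Goods" }
    ]
  },
  {
    _id: 1,
    group: "Food",
    articles: [
      { name: "Apple", area: "Fruit" },
      { name: "Pear", area: "Fruit" }
    ]
  }
];

const result = docs.map(({ articles, ...rest }) => ({
  ...rest,
  // Set keeps only unique values, preserving first-seen order
  articleAreas: [...new Set(articles.map(a => a.area))]
}));

console.log(result);
```

Since the articles aren't needed afterwards, in the actual pipeline you can drop them by appending an `{ $unset: "articles" }` stage (MongoDB 4.2+) after `$addFields`.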
