Home > Blockchain >  Get all values of arrays from mongo documents
Get all values of arrays from mongo documents

Time:06-13

I have a mongo collection with documents containing arrays:

   { item: "journal", qty: 25, tags: ["blank", "red"], dim_cm: [ 14, 21 ] },
   { item: "notebook", qty: 50, tags: ["red", "blank"], dim_cm: [ 14, 21 ] },
   { item: "paper", qty: 100, tags: ["red", "blank", "plain"], dim_cm: [ 14, 21 ] },
   { item: "planner", qty: 75, tags: ["blank", "red"], dim_cm: [ 22.85, 30 ] },
   { item: "postcard", qty: 45, tags: ["blue"], dim_cm: [ 10, 15.25 ] }

I would like to get a single array containing all distinct values, such as:

tags: ["blank", "red", blue"] and dim_cm: [14,21,22.85,30,10,15.25]

Is this possible with an aggregation pipeline?

CodePudding user response:

You can use $group with $reduce and $setIntersection:

  1. $group all documents to create one array of arrays per key
  2. flatten each array with $reduce and make it a set using $setIntersection.
db.collection.aggregate([
  {$group: {_id: null, tags: {$push: "$tags"}, dim_cm: {$push: "$dim_cm"}}},
  {
    $project: {
      _id: 0,
      tags: {
        $setIntersection: [
          {$reduce: {
              input: "$tags",
              initialValue: [],
              in: {$concatArrays: ["$$value", "$$this"]}
            }
          }
        ]
      },
      dim_cm: {
        $setIntersection: [
          {$reduce: {
              input: "$dim_cm",
              initialValue: [],
              in: {$concatArrays: ["$$value", "$$this"]}
            }
          }
        ]
      }
    }
  }
])

See how it works on the playground example

Another way is:

db.collection.aggregate([
  {$unwind: "$tags"},
  {
    $group: {
      _id: null,
      tags: {$addToSet: "$tags"},
      dim_cm: {$addToSet: "$dim_cm"
    }
  },
  {$unwind: "$dim_cm"},
  {$unwind: "$dim_cm"},
  {
    $group: {
      _id: null,
      tags: {$first: "$tags"},
      dim_cm: {$addToSet: "$dim_cm"}
    }
  }
])

Playground - unwind

Which you can split into two queries which will be much faster:

db.collection.aggregate([
  {$unwind: "$tags"},
  {
    $group: {
      _id: null,
      tags: {$addToSet: "$tags"}
    }
  },
])

A 3rd option is:

db.collection.aggregate([
  {
    $project: {
      _id: 0,
      arr: {
        $concatArrays: [
          {$map: {input: "$tags", as: "item", in: {k: "tag",  v: "$$item"}}},
          {$map: {input: "$dim_cm", as: "item", in: {k: "dim_cm", v: "$$item"}}}
        ]
      }
    }
  },
  {$unwind: "$arr"},
  {
    $group: {
      _id: null,
      tags: {
        $addToSet: {$cond: [{$eq: ["$arr.k", "tag"]}, "$arr.v", "$$REMOVE"]}
      },
      dim_cm: {
        $addToSet: {$cond: [{$eq: ["$arr.k", "dim_cm"]}, "$arr.v", "$$REMOVE"]}
      }
    }
  }
])

Playground 3rd

CodePudding user response:

Query

  • put both in one array
  • unwind
  • at group time check the type(string or not) and put to the right group

*here types are different, in case they were the same type, we can do another trick like put the in an array of pairs [[tag,cm] ...] where first would be the tag and the second would be the cm, or array of documents
For perfomance if you test it send how it went if you can

Playmongo

aggregate(
[{"$project": {"tags-dim": {"$concatArrays": ["$tags", "$dim_cm"]}}},
 {"$unwind": "$tags-dim"},
 {"$group": 
   {"_id": null,
    "tags": 
     {"$addToSet": 
       {"$cond": 
         [{"$eq": [{"$type": "$tags-dim"}, "string"]}, "$tags-dim",
          "$$REMOVE"]}},
    "dim_cm": 
     {"$addToSet": 
       {"$cond": 
         [{"$eq": [{"$type": "$tags-dim"}, "string"]}, "$$REMOVE",
          "$tags-dim"]}}}}])
  • Related