Home > Mobile >  Modifying the Aggregation Function to get a reformatted result
Modifying the Aggregation Function to get a reformatted result

Time:03-30

I have a dataset like this in mongodb

[
  {
    "task_id": "as4d2rds5",
    "url": "https:example1.com",
    "organization": "Avengers",
    "val": "null"
  },
  {
    "task_id": "rfre43fed",
    "url": "https:example1.com",
    "organization": "Avengers",
    "val": "valid"
  },
  {
    "task_id": "uyje3dsxs",
    "url": "https:example2.com",
    "organization": "Metro",
    "val": "valid"
  },
  {
    "task_id": "ghs563vt6",
    "url": "https:example1.com",
    "organization": "Avengers",
    "val": "invalid"
  },
  {
    "task_id": "erf6egy64",
    "url": "https:example2.com",
    "organization": "Metro",
    "val": "null"
  }
]

I am trying to create an mongodb aggregate function so that it will yield a result like

[
  {
    "Metro": {
      "invalid": 0,
      "null": 1,
      "valid": 1,
      "url": "https:example2.com"
    },
  },
  {
    "Avengers": {
      "invalid": 1,
      "null": 1,
      "valid": 1,
      "url": "https:example1.com"
    }
  }
]

I got a great deal of help from stackoverflow to reach here.

I need to reformat the data received from an aggregator so that it produces the above result. Present aggregation script is

db.collection.aggregate([  {"$group": {"_id": {"k": "$organization","v": "$val"},"cnt": {"$sum": 1},"url": {$first: "$url"}}},
  {"$project": {"_id": 0, "url": 1, "k": "$_id.k", "o": {"k": "$_id.v", "v": "$cnt"}}},
  {"$group": {"_id": "$k", "v": { "$push": "$o"}, "url": {"$first": "$url"}}},
  {"$addFields": {"v": {"$mergeObjects": [{"null": 0,"valid": 0,"invalid": 0},{"$arrayToObject": "$v"}]}}},
  {"$project": {"_id": 0, "url": 1, "new": [{"k": "$_id","v": "$v"}]}},
  {"$addFields": {"new": {"$mergeObjects": [{"$arrayToObject": "$new"},{"url": "$url"}]}}},
  {"$replaceRoot": {"newRoot": "$new"}} ])

CodePudding user response:

You can try this query to avoid multiple $group (as a trade off you have three $filter but I think this is still better than a multiple $group):

This query group by organization and then use $project to output the size of how many "valid", "invalid" and "null" exists.

Edit: Also you can add an extra step $replaceRoot to get exactly the same output as you want.

db.collection.aggregate([
  {
    "$group": {
      "_id": "$organization",
      "val": {
        "$push": "$val"
      },
      "url": {
        "$first": "$url"
      }
    }
  },
  {
    "$project": {
      "_id": 0,
      "organization": [
        {
          "k": "$_id",
          "v": {
            "url": "$url",
            "invalid": {
              "$size": {
                "$filter": {
                  "input": "$val",
                  "cond": {
                    "$eq": [
                      "$$this",
                      "invalid"
                    ]
                  }
                }
              }
            },
            "valid": {
              "$size": {
                "$filter": {
                  "input": "$val",
                  "cond": {
                    "$eq": [
                      "$$this",
                      "valid"
                    ]
                  }
                }
              }
            },
            "null": {
              "$size": {
                "$filter": {
                  "input": "$val",
                  "cond": {
                    "$eq": [
                      "$$this",
                      "null"
                    ]
                  }
                }
              }
            }
          }
        }
      ]
    }
  },
  {
    "$replaceRoot": {
      "newRoot": {
        "$arrayToObject": "$organization"
      }
    }
  }
])

Example here

  • Related