Home > Mobile >  MongoDB - Nested aggregation with sort and limit at each stage
MongoDB - Nested aggregation with sort and limit at each stage

Time:09-12

My document looks like this:

{
  Region: "Europe",
  Country: "Luxembourg",
  Category: "Snacks",
  Sales Channel: "Offline",
  Order Priority: "C",
  Units Sold: 9357,
  Unit Price: 421.89,
  Unit Cost: 364.69,
  Total Revenue: 3947624.73,
  Total Cost: 3412404.33,
  Total Profit: 535220.4,
}

Basically, there are multiple countries with the same Region and multiple categories with the same country.

I want to run an aggregation query which returns top n Regions with their top x Countries with top y categories by Revenue. So the response should look something like this or similar:

{
  "Region": "Asia",
  "Total": 12345,
  "Countries": [{
    "country": "China",
    "Total Revenue": 1234,
    "Categories": {
      "Snacks": {
        "Revenue": 123
      },
      "Cosmetics": {
        "Revenue": 123
      }
    }
  }, {
    "country": "China",
    "Total": 1234,
    "Categories": {
      "Snacks": {
        "Revenue": 123
      },
      "Cosmetics": {
        "Revenue": 123
      }
    }
  }]
}

So far I've managed this (it returns countries' total revenue grouped by region) but I have no idea how to further divide it by categories. How do I achieve the required result?

[{
    "$group": {
      "_id": {
        "region": "$Region",
        "country": "$Country",
        "type": "$Category"
      },
      "Revenue": {
        "$sum": "$Total Revenue"
      }
    }
  },
  {
    "$group": {
      "_id": "$_id.region",
      "Revenue": {
        "$sum": "$Revenue"
      },
      "values": {
        "$push": {
          "country": "$_id.country",
          "Revenue": {
            "$sum": "$Revenue"
          }
        }
      }
    }
  },
  {
    "$project": {
      "values": {
        "$slice": [
          "$values",
          5
        ]
      }
    }
  },
  {
    "$sort": {
      "_id": 1
    }
  },
  {
    "$limit": 2
  }
]

The response for the above query looks like this:

{
  "_id": "Asia",
  "Countries": [{
      "country": "Sri Lanka",
      "Revenue": 731282793.7
    },
    {
      "country": "Vietnam",
      "Revenue": 634208761.24
    }
  ]
}

CodePudding user response:

You should have 3 $group stages. You miss out on the $group (2nd) stage for group by Region and Country.

{
  "$group": {
    "_id": {
      "region": "$_id.region",
      "country": "$_id.country"
    },
    "Categories": {
      $push: {
        k: "$_id.type",
        v: "$Revenue"
      }
    },
    "TotalRevenue": {
      "$sum": "$Revenue"
    }
  }
}

Complete query:

db.collection.aggregate([
  {
    "$group": {
      "_id": {
        "region": "$Region",
        "country": "$Country",
        "type": "$Category"
      },
      "Revenue": {
        "$sum": "$Total Revenue"
      }
    }
  },
  {
    "$group": {
      "_id": {
        "region": "$_id.region",
        "country": "$_id.country"
      },
      "Categories": {
        $push: {
          k: "$_id.type",
          v: "$Revenue"
        }
      },
      "TotalRevenue": {
        "$sum": "$Revenue"
      }
    }
  },
  {
    "$group": {
      "_id": "$_id.region",
      "Revenue": {
        "$sum": "$TotalRevenue"
      },
      "Countries": {
        "$push": {
          "country": "$_id.country",
          "TotalRevenue": "$TotalRevenue",
          "Categories": {
            $arrayToObject: "$Categories"
          }
        }
      }
    }
  },
  {
    "$project": {
      "Countries": {
        "$slice": [
          "$Countries",
          5
        ]
      }
    }
  },
  {
    "$sort": {
      "_id": 1
    }
  },
  {
    "$limit": 2
  }
])

Demo @ Mongo Playground

  • Related