Home > Net >  MongoDB get count of field per season from MM/DD/YYYY date field
MongoDB get count of field per season from MM/DD/YYYY date field

Time:06-18

I am facing a problem in MongoDB. Suppose, I have the following collection.

{ id: 1, issueDate: "07/05/2021", code: "31" },
{ id: 2, issueDate: "12/11/2020", code: "14" },
{ id: 3, issueDate: "02/11/2021", code: "98" },
{ id: 4, issueDate: "01/02/2021", code: "14" },
{ id: 5, issueDate: "06/23/2020", code: "14" },
{ id: 6, issueDate: "07/01/2020", code: "31" },
{ id: 7, issueDate: "07/05/2022", code: "14" },
{ id: 8, issueDate: "07/02/2022", code: "20" },
{ id: 9, issueDate: "07/02/2022", code: "14" }

The date field is in the format MM/DD/YYYY. My goal is to get the count of items with each season (spring (March-May), summer (June-August), autumn (September-November) and winter (December-February).

The result I'm expecting is:

  1. count of fields for each season:

    { "_id" : "Summer", "count" : 6 } { "_id" : "Winter", "count" : 3 }

  2. top 2 codes (first and second most recurring) per season:

    { "_id" : "Summer", "codes" : {14, 31} } { "_id" : "Winter", "codes" : {14, 98} }

How can this be done?

CodePudding user response:

You should never store date/time values as string, store always proper Date objects.

You can use $setWindowFields opedrator for that:

db.collection.aggregate([
   // Convert string into Date
   { $set: { issueDate: { $dateFromString: { dateString: "$issueDate", format: "%m/%d/%Y" } } } },
   // Determine the season (0..3)
   {
      $set: {
         season: { $mod: [{ $toInt: { $divide: [{ $add: [{ $subtract: [{ $month: "$issueDate" }, 1] }, 1] }, 3] } }, 4] }
      }
   },
   // Count codes per season
   {
      $group: {
         _id: { season: "$season", code: "$code" },
         count: { $count: {} },
      }
   },
   // Rank occurrence of codes per season
   {
      $setWindowFields: {
         partitionBy: "$_id.season",
         sortBy: { count: -1 },
         output: {
            rank: { $denseRank: {} },
            count: { $sum: "$count" }
         }
      }
   },
   // Get only top 2 ranks
   { $match: { rank: { $lte: 2 } } },
   // Final grouping
   {
      $group: {
         _id: "$_id.season",
         count: { $first: "$count" },
         codes: { $push: "$_id.code" }
      }
   },
   // Some cosmetic for output
   {
      $set: {
         season: {
            $switch: {
               branches: [
                  { case: { $eq: ["$_id", 0] }, then: 'Winter' },
                  { case: { $eq: ["$_id", 1] }, then: 'Spring' },
                  { case: { $eq: ["$_id", 2] }, then: 'Summer' },
                  { case: { $eq: ["$_id", 3] }, then: 'Autumn' },
               ]
            }
         }
      }
   }
])

Mongo Playground

CodePudding user response:

I will give you clues,

  1. You need to use $group with _id as $month on issueDate, use accumulator $sum to get month wise count.
  2. You can divide month by 3, to get modulo, using $toInt, $divide, then put them into category using $cond.

CodePudding user response:

Another option:

db.collection.aggregate([
{
 $addFields: {
  "season": {
    $switch: {
      branches: [
        {
          case: {
            $in: [
              {
                $substr: [
                  "$issueDate",
                  0,
                  2
                ]
              },
              [
                "06",
                "07",
                "08"
              ]
            ]
          },
          then: "Summer"
        },
        {
          case: {
            $in: [
              {
                $substr: [
                  "$issueDate",
                  0,
                  2
                ]
              },
              [
                "03",
                "04",
                "05"
              ]
            ]
          },
          then: "Spring"
        },
        {
          case: {
            $in: [
              {
                $substr: [
                  "$issueDate",
                  0,
                  2
                ]
              },
              [
                "12",
                "01",
                "02"
              ]
            ]
          },
          then: "Winter"
        }
      ],
      default: "No date found."
    }
   }
  }
 },
{
 $group: {
  _id: {
    s: "$season",
    c: "$code"
  },
  cnt1: {
    $sum: 1
   }
  }
 },
 {
  $sort: {
    cnt1: -1
 }
},
{
 $group: {
  _id: "$_id.s",
  codes: {
    $push: "$_id.c"
  },
  cnt: {
    $sum: "$cnt1"
    }
  }
 },
 {
  $project: {
  _id: 0,
  season: "$_id",
  count: "$cnt",
  codes: {
    "$slice": [
      "$codes",
      2
     ]
    }
  }
 }
])

Explained:

  1. Add one more field for season based on $switch per month(extracted from issueDate string)
  2. Group to collect per season/code.
  3. $sort per code DESCENDING
  4. group per season to form an array with most recurring codes in descending order.
  5. Project the fields to the desired output and $slice the codes to limit only to the fist two most recurring.

Comment: Indeed keeping dates in string is not a good idea in general ...

Playground

  • Related