Home > Enterprise >  How to group by date with interval in Mongo
How to group by date with interval in Mongo

Time:11-17

I have the following collection:

{
    _id: ObjectId("6192513e0fd6ec5ad80376a7"),
    user: "mark.1239",
    connection: ISODate("2021-11-09T23:55:40.342Z"),
    disconnection: ISODate("2021-11-10T01:10:40.342Z")
},
{
    _id: ObjectId("6192513e0fd6ec5ad80376a9"),
    user: "john.9874",
    connection: ISODate("2021-11-02T07:15:42.318Z"),
    disconnection: ISODate("2021-11-02T08:40:42.318Z")
},
...

I want to get each connected users by 15 minutes interval. So in this case, John is connected at 7:30 but also at 8:15.

I can group by connection or disconnection with the 15 minutes interval, but it won't take users connected with connection and disconnection fields outside the 15 minutes interval.

I actually have this but can't find how to modify it to achieve what I want

collection.aggregate([
   { 
      "$group": {
           "_id": {
              "$toDate": {
                  "$subtract": [
                       { "$toLong": "$connection" },
                       { "$mod": [ { "$toLong": "$connection" }, 1000 * 60 * 15 ] }
                   ]
               }
           },
        "count": { "$sum": 1 },
        "user": {"$addToSet":"$user" }
      }
   },
   { $sort: {_id: 1} }
])

Is it possible to achieve this with mongo only, without javascript ?

CodePudding user response:

Your conditions are not fully clear, have a look at this outline:

        connection      disconnection
————————————|————————————————|——————————————> t
                                        

 start   end
  x     x   |                |             | NO
        x   |           x    |             | YES/NO?
            |     x     x    |             | YES
            |     x          |    x        | YES/NO?
        x   |                |    x        | YES/NO?
            |                |    x    x   | NO

One approach is to generate intervals of 15 Minutes and then filter on these intervals. Could be this one:

db.collection.aggregate([
   {
      $group: {
         _id: null,
         data: { $push: "$$ROOT" },
         // Determine total min. and max. time. Might be replaces by static values
         min: { $min: "$connection" },
         max: { $max: "$disconnection" }
      }
   },
   {
      $set: {
         // Round min/max time to 15 Minute interval
         min: { $dateTrunc: { date: "$min", unit: "minute", binSize: 15 } },
         max: { $dateTrunc: { date: "$max", unit: "minute", binSize: 15 } }
      }
   },
   {
      $set: {
         // Get number of 15-Minute intervals between min and max
         steps: {
            $dateDiff: {
               startDate: "$min",
               endDate: "$max",
               unit: "minute"
            }
         }
      }
   },
   {
      $set: {
         // Generate the 15-Minute intervals
         intervals: {
            $map: {
               input: { $range: [0, "$steps", 15] },
               as: "t",
               in: {
                  start: {
                     $dateAdd: {
                        startDate: "$min",
                        unit: "minute",
                        amount: "$$t"
                     }
                  },
                  end: {
                     $dateAdd: {
                        startDate: "$min",
                        unit: "minute",
                        amount: { $add: ["$$t", 15] }
                     }
                  }
               }
            }
         }
      }
   },
   // Transpose array to documents
   { $unwind: "$intervals" },
   // Just some cosmetic
   { $project: { data: 1, start: "$intervals.start", end: "$intervals.end" } },
   {
      $set: {
         data: {
            // Filter matching connections (based on outline above)
            $filter: {
               input: "$data",
               cond: { $and: [{ $lte: ["$$this.connection", "$start"] }, { $gte: ["$$this.disconnection", "$end"] }] }
            }
         }
      }
   },
   // Skip all 15-Minute intervals without any connection
   { $match: { data: { $ne: [] } } },
   // Count users and some cosmetic
   {
      $project: {
         start:1,
         end: 1,
         count: { $size: "$data" },
         user: { $setUnion: "$data.user" }
      }
   }
])

Another approach could be $bucket but would be difficult, because it works only with numeric values, i.e. you have to twist around Date values.

Just a tip: Create a helper function

function between(ts, start, end) {
   let ret = {};
   if (typeof ts == "string") {
      ret["$expr"] = { $and: [{ $lte: ["$"   start, "$"   ts] }, { $gte: ["$"   end, "$"   ts] }] };
   } else {
      ret[start] = { $lte: ts };
      ret[end] = { $gte: ts };
   }
   return ret;
}

It correspond to SQL BETWEEN Operator, i.e. WHERE ts BETWEEN start AND end

Then you can compose conditions like

{ $match: between("start", "connection", "disconnection") }
{ $match: between(new Date(), "connection", "disconnection") }

and it makes it much easier to write your query.

  • Related