Query to count the number of documents for each user

Time:05-19

I have a collection named 'captures', and its documents have a 'username' field.

A document looks something like this:

/* 1 */
{
    "_id" : ObjectId("622b951a026ca3a73f5a2a1c"),
    "username" : "andre",
    "data" : {
         "metadata" : {
            "start" : "2022-02-24T09:32:22.390Z",
            ...
          },
          ...
    }
}
/* 2 */
{
    "_id" : ObjectId("9255941b026ca3a73f5a2a1c"),
    "username" : "andre",
    "data" : {
         "metadata" : {
            "start" : "2022-05-10T03:12:23.440Z",
            ...
          },
          ...
    }
}
/* 3 */
{
    "_id" : ObjectId("7775941b026ca3a73f5a2a1c"),
    "username" : "bob",
    "data" : {
         "metadata" : {
            "start" : "2022-05-16T12:24:12.002Z",
            ...
          },
          ...
    }
}
/* 4 */
{
    "_id" : ObjectId("3215331b026ca3a73f5a2a1c"),
    "username" : "bob",
    "data" : {
         "metadata" : {
            "start" : "2022-05-18T12:24:12.002Z",
            ...
          },
          ...
    }
}

I'd like to return a count of documents for each distinct username, counting only documents where 'data.metadata.start' is after 2022-02-24T09:32:22.390Z.

The above example would return something like:

{ "user" : "andre", "count" : 1 }
{ "user" : "bob", "count" : 2 }

I've tried using count, distinct, and aggregate, without success.

Answer:

This is pretty simple to do with the aggregation framework:

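db.captures.aggregate([
  {
    // Keep only the fields we need and convert the string to a Date
    $project: {
      _id: 0,
      user: '$username',
      start: {
        $toDate: '$data.metadata.start'
      }
    }
  },
  {
    // Keep documents that start strictly after the cutoff
    $match: {
      start: {
        // `new` is required: a bare Date(...) call returns a string
        $gt: new Date('2022-02-24T09:32:22.390Z')
      }
    }
  },
  {
    // Count the remaining documents per user
    $group: {
      _id: '$user',
      user: {
        $first: '$user'
      },
      count: {
        $sum: 1
      }
    }
  }
])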
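
To sanity-check what the pipeline is meant to compute, here is a minimal sketch in plain JavaScript that mirrors the three stages in memory, using the four sample documents from the question. The helper name countByUserAfter is hypothetical and not part of MongoDB; it only illustrates the logic (convert, filter, count), not how the server executes it.

```javascript
// Sample documents from the question, reduced to the fields the query uses.
const captures = [
  { username: 'andre', data: { metadata: { start: '2022-02-24T09:32:22.390Z' } } },
  { username: 'andre', data: { metadata: { start: '2022-05-10T03:12:23.440Z' } } },
  { username: 'bob', data: { metadata: { start: '2022-05-16T12:24:12.002Z' } } },
  { username: 'bob', data: { metadata: { start: '2022-05-18T12:24:12.002Z' } } },
];

// Hypothetical helper: mirrors $project + $match + $group in plain JavaScript.
function countByUserAfter(docs, cutoffIso) {
  // `new` matters here: calling Date(...) without it returns a string.
  const cutoff = new Date(cutoffIso);
  const counts = new Map();
  for (const doc of docs) {
    const start = new Date(doc.data.metadata.start); // plays the role of $toDate
    if (start > cutoff) {                            // plays the role of $match with $gt
      const seen = counts.get(doc.username) || 0;
      counts.set(doc.username, seen + 1);            // plays the role of $sum: 1
    }
  }
  // Shape each entry like the output requested in the question.
  return [...counts].map(([user, count]) => ({ user, count }));
}

const result = countByUserAfter(captures, '2022-02-24T09:32:22.390Z');
console.log(result);
```

The first 'andre' document starts exactly at the cutoff, so the strict comparison excludes it, which is why the question expects a count of 1 for that user.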

By the way, you should store dates as Date objects, not strings; it will make your life easier.
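
As a rough illustration of that advice, the sketch below assumes 'data.metadata.start' has already been converted to a real Date. The names buildCountPipeline and conversionUpdate are hypothetical, and the one-off conversion assumes MongoDB 4.2 or later, where updates accept an aggregation pipeline. With stored dates the $toDate step disappears and the $match can come first, which may let the server use an index on that field if one exists.

```javascript
// Hypothetical one-off migration: rewrite the string field as a Date.
// Assumes MongoDB 4.2+; in mongosh it would be run as
//   db.captures.updateMany({}, conversionUpdate)
const conversionUpdate = [
  { $set: { 'data.metadata.start': { $toDate: '$data.metadata.start' } } },
];

// Hypothetical helper: builds the counting pipeline for documents whose
// data.metadata.start is already stored as a Date.
function buildCountPipeline(cutoffIso) {
  return [
    // No conversion needed: compare the stored Date directly.
    { $match: { 'data.metadata.start': { $gt: new Date(cutoffIso) } } },
    // One output document per username, with the number of matches.
    { $group: { _id: '$username', count: { $sum: 1 } } },
    // Rename _id to user so the output matches the shape in the question.
    { $project: { _id: 0, user: '$_id', count: 1 } },
  ];
}

const pipeline = buildCountPipeline('2022-02-24T09:32:22.390Z');
console.log(JSON.stringify(pipeline, null, 2));
// In mongosh this would be passed as: db.captures.aggregate(pipeline)
```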
