MongoDB - Get last date of every distinct name

Time:09-23

I am new to MongoDB and am having trouble writing a query that, for every distinct name, selects the document with the latest date and returns the whole document.

I have the following data in my collection (students):

{ "_id" : ObjectId("61479d4bc146b1663a8f2b7d"), "city" : "SAO PAULO", "name" : "ANA", "status" : "ACTIVE", "date1" : ISODate("2020-09-01T08:14:30.000Z") }
{ "_id" : ObjectId("61479d88c146b1663a8f2b7e"), "city" : "SAO PAULO", "name" : "MARIA", "status" : "ACTIVE", "date1" : ISODate("2020-08-01T04:16:00.000Z") }
{ "_id" : ObjectId("61479dc2c146b1663a8f2b7f"), "city" : "RIO DE JANEIRO", "name" : "MARIA", "status" : "ACTIVE", "date1" : ISODate("2021-02-01T11:10:00.000Z") }
{ "_id" : ObjectId("61479df1c146b1663a8f2b80"), "city" : "SAO PAULO", "name" : "MARIA", "status" : "INACTIVE", "date1" : ISODate("2021-02-01T11:15:00.000Z") }
{ "_id" : ObjectId("61479e60c146b1663a8f2b81"), "city" : "BRASILIA", "name" : "JOHH", "status" : "ACTIVE", "date1" : ISODate("2021-06-01T01:18:00.000Z") }

I want to filter on status "ACTIVE" and show only the most recent document for each student, returning only "city", "name" and "date1". This is my attempt, using $max (I also tried $last) in the $group stage:

db.getCollection('students').aggregate([
   { $match: { status: "ACTIVE" } },
   { $group: {
       _id: { name: "$name" },
       date1: { $max: "$date1" },
       city: { $max: "$city" }
   } }
])

The wanted result:

{ "city" : "SAO PAULO", "name" : "ANA", "date1" : ISODate("2020-09-01T08:14:30.000Z") }
{ "city" : "RIO DE JANEIRO", "name" : "MARIA", "date1" : ISODate("2021-02-01T11:10:00.000Z") }
{ "city" : "BRASILIA", "name" : "JOHH", "date1" : ISODate("2021-06-01T01:18:00.000Z") }

But the result is this:

{ "city" : "SAO PAULO", "name" : "ANA", "date1" : ISODate("2020-09-01T08:14:30.000Z") }
{ "city" : "SAO PAULO", "name" : "MARIA", "date1" : ISODate("2021-02-01T11:10:00.000Z") }
{ "city" : "BRASILIA", "name" : "JOHH", "date1" : ISODate("2021-06-01T01:18:00.000Z") }

This returns the wrong data. For ANA and JOHH (one document each) the result is fine. But MARIA has three documents, two of them ACTIVE, and I need every field to come from her document with the latest date. Instead I get "city" : "SAO PAULO" rather than "city" : "RIO DE JANEIRO", because $max is evaluated separately for each field: the date comes from one document and the city (the alphabetically greatest value) from another. Every field inside $group needs an accumulator operator, so I cannot simply drop $max from "city".
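
The mismatch can be reproduced outside MongoDB. The plain JavaScript sketch below only imitates what the two $max accumulators do for MARIA's ACTIVE documents; the variable names are made up for illustration and it does not talk to a database:

```javascript
// MARIA's two ACTIVE documents, copied from the sample data above.
const mariaActive = [
  { city: "SAO PAULO", name: "MARIA", date1: new Date("2020-08-01T04:16:00.000Z") },
  { city: "RIO DE JANEIRO", name: "MARIA", date1: new Date("2021-02-01T11:10:00.000Z") },
];

// Each accumulator looks at its own field only, ignoring which document a value came from.
const maxDate = new Date(Math.max(...mariaActive.map((d) => d.date1.getTime())));
const maxCity = mariaActive.map((d) => d.city).reduce((a, b) => (a > b ? a : b));

// The date comes from the RIO DE JANEIRO document, the city from the SAO PAULO one.
console.log(maxDate.toISOString(), maxCity);
```

"SAO PAULO" wins the string comparison because "S" sorts after "R", which is unrelated to which document is the newest.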

I don't know how to fix it. How can I get the whole document for the last date of every distinct name?

CodePudding user response:

You can use this aggregation pipeline:

  • First $match on status, as you already do.
  • Then $sort by date1 in descending order, so that the most recent document for each name comes first. The next stage relies on this order.
  • In the $group stage, take the $first value of each field. Because the input is sorted, all $first values come from the same (most recent) document.
  • Finally, $project to shape the output.

db.collection.aggregate([
  {
    "$match": {
      "status": "ACTIVE"
    }
  },
  {
    "$sort": {
      "date1": -1
    }
  },
  {
    "$group": {
      "_id": {
        "name": "$name"
      },
      "date1": {
        "$first": "$date1"
      },
      "city": {
        "$first": "$city"
      }
    }
  },
  {
    "$project": {
      "_id": 0,
      "name": "$_id.name",
      "city": 1,
      "date1": 1
    }
  }
])
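
If you want the whole document rather than a hand-picked list of fields, a possible variation is to keep the entire first document with $$ROOT and then promote it with $replaceRoot. This is a sketch, not tested against your server; the names wholeDocPipeline and latest are made up for illustration, and the collection name students is taken from the question:

```javascript
// Variation of the pipeline above: keep the full most-recent document per name.
const wholeDocPipeline = [
  // Keep only ACTIVE documents.
  { $match: { status: "ACTIVE" } },
  // Newest first, so $first below picks the most recent document of each group.
  { $sort: { date1: -1 } },
  // $$ROOT refers to the whole input document, so all fields stay together.
  { $group: { _id: "$name", latest: { $first: "$$ROOT" } } },
  // Replace each group result with the stored document.
  { $replaceRoot: { newRoot: "$latest" } },
];

// In mongosh, assuming the collection is called students:
// db.students.aggregate(wholeDocPipeline)
```

The output then also contains _id and status; append a $project stage if you only need city, name and date1.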
