Home > Back-end >  MongoDB | Listing the Zip Code/Postal Code for each City in New York
MongoDB | Listing the Zip Code/Postal Code for each City in New York

Time:11-15

I am learning how to write MongoDB queries as part of my personal project. I think I am missing something small or big (I don't know), and hoping that someone could point to what is wrong with my MongoDB query. I am working on the popular zip code data set that is provided on MongoDB here: https://docs.mongodb.com/manual/tutorial/aggregation-zip-code-data-set/

An example of a document in the zipcodes collection has the following form:

{
  "_id": "10280",
  "city": "NEW YORK",
  "state": "NY",
  "pop": 5574,
  "loc": [
    -74.016323,
    40.710537
  ]
}

Goal: I am trying to find the population of each city in the state of New York (NY) where the results list the postal codes that are included in each city. The output should look something like this:

{
    _id : "ELMIRA",
    population : <population of ELMIRA>,
    postal_codes : [
        <PostalCode1 in ELMIRE>,
        <PostalCode2 in ELMIRE>,
        <PostalCode3 in ELMIRA>
        ]
    },
    {
        _id : "WHITESVILLE",
        population : <population of WHITESVILLE>,
        postal_codes : [
            <PostalCode1 in WHITESVILLE>
        ]
    },

My Code: Here is what I have written so far

db.getCollection('codes').aggregate([
    { 
        $group: {
            _id: {city: "$city", state:"$state"},
            population: {$sum: "$pop"},
            postal_codes: {$addToSet: "$_id"}
        }
    },
    {
        $match: {
            "_id.state":"NY"
        
        }
    
    },
    {
        $group: {
            _id: "$_id.city",
            population: {$last: "$population"},
            postal_codes: {$last: "$postal_codes"}
        }
    }       
])

Problem - The issue that I see with my code is that it only lists one of the zip codes for a city in New York that has multiple zip codes, as shown below. In the U.S., a city may overlap several postal codes. How can I fix my query to include all zip codes for each city in New York instead of one of them?

{ 
    "_id" : "OSSINING", 
    "population" : NumberInt(29926), 
    "postal_codes" : [
        "10562"
    ]
}
{ 
    "_id" : "COLD SPRING", 
    "population" : NumberInt(4904), 
    "postal_codes" : [
        "10516"
    ]
}
{ 
    "_id" : "NORTH WOODMERE", 
    "population" : NumberInt(19582), 
    "postal_codes" : [
        "11581"
    ]
}

CodePudding user response:

You can simply use $addToset instead of $last to get all the postal_codes.

One side note: you can make your code a bit more performant by reducing the intermediate result set through pushing the $match by state to earlier stage.

db.collection.aggregate([
  {
    $match: {
      "state": "NY"
    }
  },
  {
    $group: {
      _id: "$city",
      population: {
        $sum: "$pop"
      },
      postal_codes: {
        $addToSet: "$_id"
      }
    }
  }
])

Here is the Mongo playground for your reference.

  • Related