Home > Enterprise >  MongoDB aggregation structure causing some values to return as none
MongoDB aggregation structure causing some values to return as none

Time:06-22

I have written a simple function using mongoDB aggregation pipelines. The issue occurs when attempting to run it. For some reason min_price and max_price both return as none instead of a real value. The individual documents are formatted as so,

{"_id":{"$oid":"62afc6584c7476a02643b61f"},"item_name":"Titanic Super Heavy Leggings §6✪§6✪§6✪§6✪§6✪","auction_id":"516d82192fef4c3aa178b134e75fc0e1","seller":"dae37eba5f594922bc8d79b9a78f8543","seller_profile":"585d9e30f644496cafa0b36b555ef788","buyer":"7a3699ca1eef40929c714362ab35915a","timestamp":{"$numberLong":"1655686642260"},"price":{"$numberInt":"1000000"},"bin":true}

Below is the function used to calculate min and max and sorting them by date.

def min_max_volume(item_name):
    db = client.test
    mycol = db["ended"]
    pipeline = [
        {"$match": {"item_name": item_name}},
        {"$project": {
            'datetime': { '$dateToString': { 'format': "%Y-%m-%d", 'date': { "$toDate": "$timestamp"}}},
        }},
        {"$group": {
            '_id': '$datetime',
            'min_price': {'$min': '$price'},
            'max_price': {'$max': '$price'},
        }}
    ]
    results = mycol.aggregate(pipeline)
    for i in results:
        print(i)

The expected output should look like this,

{'_id': '2022-06-20', 'min_price': 345839475, 'max_price': 48534875}
{'_id': '2022-06-21', 'min_price': 456567, 'max_price': 348573945}
{'_id': '2022-06-22', 'min_price': 6486956, 'max_price': 12938291}

but instead does not contain values in the min_price and max_price fields.

{'_id': '2022-06-20', 'min_price': None, 'max_price': None}
{'_id': '2022-06-21', 'min_price': None, 'max_price': None}
{'_id': '2022-06-22', 'min_price': None, 'max_price': None}

CodePudding user response:

price field is missing in $project stage

CodePudding user response:

The output of first stage of pipeline will be the input of second stage. Likewise it continues. In the second stage i.e. in project, you missed to add price. Then, price is not in the group stage to query.

 pipeline = [
            {"$match": {"item_name": item_name}},
            {"$project": {
                'datetime': { '$dateToString': { 'format': "%Y-%m-%d", 'date': { "$toDate": "$timestamp"}}}, 'price':1
            }},
            {"$group": {
                '_id': '$datetime',
                'min_price': {'$min': '$price'},
                'max_price': {'$max': '$price'},
            }}
        ]
  • Related