gain over time in mongodb

Time:01-02

How can the change in a value over time be calculated in MongoDB?

I am streaming values over time.
The data is collected at random sub-second intervals, one document per reading.
The streamed data is grouped and averaged in one-minute groups.
The goal is to compare each value to the one-minute average one hour later.

Example data:

[
  {
    _id: ObjectId("63a318c36ccc42d2330fae5e"),
    timestamp: ISODate("2022-12-21T14:30:31.172Z"),
    value: 3.8
  },
  {
    _id: ObjectId("63a318c46ccc42d2330fae8d"),
    timestamp: ISODate("2022-12-21T14:30:32.189Z"),
    value: 4.0
  },
  {
    _id: ObjectId("63a318c36ccc42d2330fae5e"),
    timestamp: ISODate("2022-12-21T15:30:14.025Z"),
    value: 5.0
  },  
  {
    _id: ObjectId("63a318c36ccc42d2330fae5e"),
    timestamp: ISODate("2022-12-21T15:30:18.025Z"),
    value: 5.5
  }
]

Values grouped and averaged in one-minute groups:

{$group:{_id:{
        "code": "$code",
        "year": { "$year": "$timestamp" },
        "dayOfYear": { "$dayOfYear": "$timestamp" },
        "hour": { "$hour": "$timestamp" },
        "minute": { "$minute": "$timestamp" }
        },
        value:{$avg:"$value"},
        timestamp:{$first:"$timestamp"}
        }
},

This gets close to the goal, but it aggregates all the values over a one-hour interval:

{$group:{_id:{
        "code": "$code",
        "year": { "$year": "$timestamp" },
        "dayOfYear": { "$dayOfYear": "$timestamp" },
        "hour": { "$hour": "$timestamp" }
        },
        value:{$first:"$value"},
        valueLast:{$last:"$value"},     
        timestamp:{$first:"$timestamp"},
        }
},

Instead, I want to look at the change for the individual documents. That is, what is the 14:30 value at 15:30, and what is the 15:35 value at 16:35? How do I compare each document's value to the value one hour later? The desired output would look like this:

[
  {
    _id: ObjectId("63a318c36ccc42d2330fae5e"),
    timestamp: ISODate("2022-12-21T14:30:31.172Z"),
    value: 3.8,
    valueLast: 5.25,
    gainPct: .382
  },
  {
    _id: ObjectId("63a318c46ccc42d2330fae8d"),
    timestamp: ISODate("2022-12-21T14:30:32.189Z"),
    value: 4.0,
    valueLast: 5.25,
    gainPct: .313
  },
]

Answer:

One option is to use $setWindowFields with a time range. It lets you partition by code, sort by cleanTimeStamp, and perform an accumulation function ($avg) over all documents within a time range relative to the current document (each document in turn):

db.collection.aggregate([
  {$set: {
      cleanTimeStamp: {
        $dateTrunc: {
          date: "$timestamp",
          unit: "minute"
        }
      }
  }},
  {$setWindowFields: {
      partitionBy: "$code",
      sortBy: {cleanTimeStamp: 1},
      output: {
        valueLast: {
          $avg: "$value",
          window: {range: [59, 60], unit: "minute"}
        }
      }
  }},
  {$set: {
      gainPct: {$round: [{$divide: [{$subtract: ["$valueLast", "$value"]}, "$value"]}, 3]},
      cleanTimeStamp: "$$REMOVE"
    }
  }
])
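
To make the window semantics concrete, here is a minimal in-memory sketch in plain JavaScript of what the pipeline above computes on the sample data from the question. It is an illustration only, not how MongoDB evaluates the stage: the helper names truncateToMinute and gainOverWindow are hypothetical, the partition by code is omitted because the sample documents have no code field, and the gain is left unrounded.

```javascript
// Sample documents from the question, with Date in place of ISODate.
const docs = [
  { timestamp: new Date("2022-12-21T14:30:31.172Z"), value: 3.8 },
  { timestamp: new Date("2022-12-21T14:30:32.189Z"), value: 4.0 },
  { timestamp: new Date("2022-12-21T15:30:14.025Z"), value: 5.0 },
  { timestamp: new Date("2022-12-21T15:30:18.025Z"), value: 5.5 },
];

const MINUTE_MS = 60 * 1000;

// Hypothetical helper: start of the minute as epoch milliseconds,
// playing the role of $dateTrunc with unit "minute".
function truncateToMinute(date) {
  return Math.floor(date.getTime() / MINUTE_MS) * MINUTE_MS;
}

// Hypothetical helper: for each document, average the values whose
// truncated minute lies between `lower` and `upper` minutes (inclusive)
// after the document's own truncated minute.
function gainOverWindow(documents, lower, upper) {
  return documents.map((doc) => {
    const base = truncateToMinute(doc.timestamp);
    const inWindow = documents.filter((other) => {
      const offset = (truncateToMinute(other.timestamp) - base) / MINUTE_MS;
      return offset >= lower && offset <= upper;
    });
    if (inWindow.length === 0) {
      // No data one hour later: nothing to compare against.
      return { ...doc, valueLast: null, gainPct: null };
    }
    const total = inWindow.reduce((sum, d) => sum + d.value, 0);
    const valueLast = total / inWindow.length;
    return { ...doc, valueLast, gainPct: (valueLast - doc.value) / doc.value };
  });
}

// Same bounds as the window in the pipeline: [59, 60] minutes ahead.
const results = gainOverWindow(docs, 59, 60);
console.log(results);
```

With these bounds, a document also picks up readings from the minute 59 minutes ahead when such readings exist. If only the exact minute one hour later should count, bounds of 60 and 60 would be the narrower choice.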


It is not clear to me whether you want the result for each document or only for a specific timestamp. If you only want results for a specific minute, you can add a $match as the first stage to limit the input documents to those between the wanted timestamp and one hour after it.
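
A rough sketch of such a $match stage follows. The helper name matchStageFor is hypothetical, and the upper bound of one hour plus one minute is an assumption, chosen so that the whole minute one hour later is still included in the input.

```javascript
const MINUTE_MS = 60 * 1000;
const HOUR_MS = 60 * MINUTE_MS;

// Hypothetical helper: builds a $match stage that keeps documents from the
// start of the wanted minute up to (but not including) one hour and one
// minute later.
function matchStageFor(minuteStart) {
  const end = new Date(minuteStart.getTime() + HOUR_MS + MINUTE_MS);
  return { $match: { timestamp: { $gte: minuteStart, $lt: end } } };
}

const stage = matchStageFor(new Date("2022-12-21T14:30:00.000Z"));
console.log(JSON.stringify(stage, null, 2));
```

In the shell, this stage would go at the front of the array passed to db.collection.aggregate, before the $set and $setWindowFields stages. Documents from the later part of the range would still come back, with no valueLast to compare against, so a further filter on the wanted minute may be needed at the end.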
