How is the change in a value over time calculated in MongoDB?
I am streaming values over time.
The data is collected at random sub-second intervals, with one document per reading.
The streamed data is then grouped and averaged in one-minute groups.
The goal is to compare each value to the one-minute average one hour later.
Example data:
[
{
_id: ObjectId("63a318c36ccc42d2330fae5e"),
timestamp: ISODate("2022-12-21T14:30:31.172Z"),
value: 3.8
},
{
_id: ObjectId("63a318c46ccc42d2330fae8d"),
timestamp: ISODate("2022-12-21T14:30:32.189Z"),
value: 4.0
},
{
_id: ObjectId("63a318c36ccc42d2330fae5e"),
timestamp: ISODate("2022-12-21T15:30:14.025Z"),
value: 5.0
},
{
_id: ObjectId("63a318c36ccc42d2330fae5e"),
timestamp: ISODate("2022-12-21T15:30:18.025Z"),
value: 5.5
}
]
Values grouped and averaged in one-minute groups:
{$group:{_id:{
"code": "$code",
"year": { "$year": "$timestamp" },
"dayOfYear": { "$dayOfYear": "$timestamp" },
"hour": { "$hour": "$timestamp" },
"minute": { "$minute": "$timestamp" }
},
value:{$avg:"$value"},
timestamp:{$first:"$timestamp"}
}
},
This gets close to the goal, but it aggregates all the values over a one-hour interval:
{$group:{_id:{
"code": "$code",
"year": { "$year": "$timestamp" },
"dayOfYear": { "$dayOfYear": "$timestamp" },
"hour": { "$hour": "$timestamp" }
},
value:{$first:"$value"},
valueLast:{$last:"$value"},
timestamp:{$first:"$timestamp"},
}
},
Instead, I want to look at the change for the individual documents. That is, what is the 14:30 value at 15:30, and what is the 15:35 value at 16:35? How do I compare each document's value to the one-minute average one hour later? The desired output would look like this:
[
{
_id: ObjectId("63a318c36ccc42d2330fae5e"),
timestamp: ISODate("2022-12-21T14:30:31.172Z"),
value: 3.8,
valueLast: 5.25,
gainPct: .382
},
{
_id: ObjectId("63a318c46ccc42d2330fae8d"),
timestamp: ISODate("2022-12-21T14:30:32.189Z"),
value: 4.0,
valueLast: 5.25,
gainPct: .313
},
]
CodePudding user response:
One option is to use $setWindowFields with a time range. It allows you to partition by code, sort by cleanTimeStamp, and perform an accumulation function ($avg) on all documents within a time range relative to the current document (each document in context):
db.collection.aggregate([
{$set: {
cleanTimeStamp: {
$dateTrunc: {
date: "$timestamp",
unit: "minute"
}
}
}},
{$setWindowFields: {
partitionBy: "$code",
sortBy: {cleanTimeStamp: 1},
output: {
valueLast: {
$avg: "$value",
window: {range: [59, 60], unit: "minute"}
}
}
}},
{$set: {
gainPct: {$round: [{$divide: [{$subtract: ["$valueLast", "$value"]}, "$value"]}, 3]},
cleanTimeStamp: "$$REMOVE"
}
}
])
See how it works on the playground example
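To check the numbers without a database, the same arithmetic can be sketched in plain JavaScript. This is only an emulation of what the pipeline above is expected to compute, not a MongoDB call; truncToMinute and compareToLater are hypothetical helper names, and the window bounds are treated as inclusive on both ends, which is how I read the range window documentation.

```javascript
// Plain-JavaScript emulation of the pipeline's arithmetic (not a MongoDB call).
const MINUTE_MS = 60 * 1000;

// Sample documents from the question (timestamps and values only).
const docs = [
  { timestamp: new Date("2022-12-21T14:30:31.172Z"), value: 3.8 },
  { timestamp: new Date("2022-12-21T14:30:32.189Z"), value: 4.0 },
  { timestamp: new Date("2022-12-21T15:30:14.025Z"), value: 5.0 },
  { timestamp: new Date("2022-12-21T15:30:18.025Z"), value: 5.5 },
];

// Same idea as $dateTrunc with unit "minute": drop seconds and milliseconds.
function truncToMinute(date) {
  return Math.floor(date.getTime() / MINUTE_MS) * MINUTE_MS;
}

// For each document, average the values whose truncated minute lies between
// lo and hi minutes (both inclusive) after the document's own truncated minute.
function compareToLater(input, lo, hi) {
  return input.map((doc) => {
    const base = truncToMinute(doc.timestamp);
    const later = input.filter((other) => {
      const diff = (truncToMinute(other.timestamp) - base) / MINUTE_MS;
      return diff >= lo && diff <= hi;
    });
    if (later.length === 0) {
      // No documents in the window: nothing to compare against.
      return { ...doc, valueLast: null, gainPct: null };
    }
    const valueLast = later.reduce((sum, d) => sum + d.value, 0) / later.length;
    // Math.round rounds ties upward; see the note below about $round.
    const gainPct = Math.round(((valueLast - doc.value) / doc.value) * 1000) / 1000;
    return { ...doc, valueLast, gainPct };
  });
}

const result = compareToLater(docs, 59, 60);
console.log(result.map(({ value, valueLast, gainPct }) => ({ value, valueLast, gainPct })));
```

For the sample data, the two 14:30 documents get valueLast 5.25 and gainPct 0.382 and 0.313, while the two 15:30 documents get null because nothing exists an hour after them. Two caveats: with minute-truncated timestamps the [59, 60] range also picks up documents from the minute 59 minutes later, so [60, 60] may be closer to a strict one-minute average. Also, MongoDB documents $round as rounding exact ties to the nearest even digit, so a tie such as 0.3125 may come back as 0.312 there rather than 0.313.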
It is not clear to me whether you want the result for each document or for a specific timestamp. If you only want the query to return results for a specific minute, you can add a $match stage as the first step, to limit the documents in context to those between the wanted timestamp and one hour after it.
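A minimal sketch of building such a $match stage follows. matchStageFor is a hypothetical helper, not part of the answer above, and the 61-minute upper bound is my assumption so that the whole minute one hour later stays in context.

```javascript
// Hypothetical helper (not from the original answer): builds a $match stage
// that keeps documents from the start of the wanted minute up to, but not
// including, the end of the minute one hour later.
const MINUTE_MS = 60 * 1000;

function matchStageFor(wantedTime) {
  // Snap to the start of the minute, mirroring $dateTrunc with unit "minute".
  const from = new Date(Math.floor(wantedTime.getTime() / MINUTE_MS) * MINUTE_MS);
  // Assumption: 60 minutes ahead plus the full one-minute bucket at the end.
  const to = new Date(from.getTime() + 61 * MINUTE_MS);
  return { $match: { timestamp: { $gte: from, $lt: to } } };
}

const matchStage = matchStageFor(new Date("2022-12-21T14:30:00Z"));
console.log(JSON.stringify(matchStage));
```

The returned object would go first in the array passed to db.collection.aggregate, ahead of the $set and $setWindowFields stages. The later documents still pass through the window stage with a null valueLast, so a second $match on the wanted minute after $setWindowFields may be needed if only that minute should be returned.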