I have the following script to return an aggregation hour by hour of last x days on ExecNom:
GET /mkt-with-time/_search
{
"size": 0,
"query": {
"range": {
"@timestamp": {
"gte": "now-7d/d",
"lte": "now-2d/d"
}
}
},
"aggs": {
"per_hour": {
"date_histogram": {
"field": "@timestamp",
"calendar_interval": "hour"
},
"aggs": {
"avg_execnom": {
"avg": {
"field": "ExecNom"
}
}
}
}
}
}
this returns:
"aggregations": {
"per_hour": {
"buckets": [
{
"key_as_string": "2022-08-14T00:00:00.000Z",
"key": 1660435200000,
"doc_count": 3633,
"avg_execnom": {
"value": 252701.0352904938
}
},
{
"key_as_string": "2022-08-14T01:00:00.000Z",
"key": 1660438800000,
"doc_count": 3651,
"avg_execnom": {
"value": 258698.55131756698
}
},
...
{
"key_as_string": "2022-08-15T00:00:00.000Z",
"key": 1660521600000,
"doc_count": 3559,
"avg_execnom": {
"value": 248551.24137114585
}
},
{
"key_as_string": "2022-08-15T01:00:00.000Z",
"key": 1660525200000,
"doc_count": 3553,
"avg_execnom": {
"value": 250348.02511778387
}
},
{
"key_as_string": "2022-08-15T02:00:00.000Z",
"key": 1660528800000,
"doc_count": 3692,
"avg_execnom": {
"value": 258052.43654516252
}
},
...
]
}
}
I have data hour by hour for today: for the simplicity, I write 'today' as if it is today's date.
{
"key_as_string": "todayT02:00:00.000Z",
"key": 1660528800000,
"doc_count": 3692,
"avg_execnom": {
"value": 258052.43654516252
}
},
{
"key_as_string": "todayT03:00:00.000Z",
"key": 1660528800000,
"doc_count": 3692,
"avg_execnom": {
"value": 258052.43654516252
}
}
...
I am looking for a way to compare hour by hour, today's data versus the one of last X days. I have already today's data, how to find the hour by hour average for last X days.
To be explicit:
[
avg(2022-08-14T00:00:00, 2022-08-15T00:00:00, ..., 2022-08-22T00:00:00),
avg(2022-08-14T01:00:00, 2022-08-15T01:00:00, ..., 2022-08-22T01:00:00),
avg(2022-08-14T02:00:00, 2022-08-15T02:00:00, ..., 2022-08-22T02:00:00),
...
avg(2022-08-14T23:00:00, 2022-08-15T23:00:00, ..., 2022-08-22T23:00:00)
]
Do you have an idea how to achieve that?
Inspired from this: https://discuss.elastic.co/t/average-per-day-of-week-aggregation/124132/2
this is my last try but does not work:
GET /mkt-with-time/_search
{
"size": 0,
"aggs": {
"orders_per_hour_of_day": {
"terms": {
"script" : {
"lang": "painless",
"source": "doc['@timestamp'].value.getHour()"
},
"size": 24
},
"aggs": {
"dayOfWeek": {
"terms": {
"script": {
"lang": "painless",
"source": "doc['@timestamp'].value.getDayOfWeekEnum()"
}
},
"aggs": {
"total_execnom": {
"sum": {
"field": "ExecNom"
}
},
"number_of_weeks": {
"date_histogram": {
"field": "@timestamp",
"calendar_interval": "week"
}
},
"average_orders_per_hour_day_of_week": {
"bucket_selector": {
"buckets_path": {
"totalExecNom" : "total_execnom",
"number_of_weeks": "number_of_weeks._bucket_count"
},
"script": "params.totalExecNom / params.number_of_weeks"
}
}
}
}
}
}
}
}
Thanks!
CodePudding user response:
You can use runtime mapping for this kind of queries, so your query will look like this:
GET /mkt-with-time/_search
{
"size": 0,
"runtime_mappings": {
"hour_of_day": {
"type": "keyword",
"script": {
"source": "emit(doc['@timestamp'].value.getHour().toString())"
}
}
},
"aggs": {
"per_hour": {
"terms": {
"field": "hour_of_day",
"size":30
},
"aggs": {
"avg_per_dat": {
"avg": {
"field": "ExecNom"
}
}
}
}
}
}
In this case, in the first step, Elastic will build new field named "hour_of_day" contains hour number 1 -24, and in the aggregation you will use regular avg terms connection
CodePudding user response:
thank you very much for your reply @Vakhtang. I'm struggling to do a sort, to make the bucket 0->23 sorted. 0->23.
# with sort
GET /mkt-with-time/_search
{
"size": 0,
"runtime_mappings": {
"hour_of_day": {
"type": "keyword",
"script": {
"source": "emit(doc['@timestamp'].value.getHour().toString())"
}
}
},
"aggs": {
"per_hour": {
"terms": {
"field": "hour_of_day",
"size":30
},
"aggs": {
"avg_per_date": {
"avg": {
"field": "ExecNom"
}
}
},
"mysort":{
"bucket_sort": {
"sort": [
{"per_hour": {"order": "desc"}}
]
}
}
}
}
}
Yelling:
"reason": "Found two aggregation type definitions in [per_hour]: [terms] and [mysort]"