I am reading the date histogram https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-datehistogram-aggregation.html, it shows how to aggregate based on intervals like month, hour etc.
But I have a requirement to aggregate them based on the daily hour. For example, I'd like to aggregate the date on 8am everyday in a week. It needs to aggregate all document happens between 8:00am to 8:59am from Monday to Sunday. The response should give me 7 buckets (one for each day in the week).
The response may look like. If I query one month data ( assume there are 4 weeks in this month ), the Monday bucket will include the sum of data between 8am to 9am on 4 Mondays in the month.
"buckets" : [
{
"key_as_string" : "MONDAY 8am",
"key" : 1641254400000,
"doc_count" : 5
},
{
"key_as_string" : "TUESDAY 8am",
"key" : 1641254400000,
"doc_count" : 5
}
...
{
"key_as_string" : "SUNDAY 8am",
"key" : 1641254400000,
"doc_count" : 5
}
Is there any way to achieve that in Elasticsearch?
CodePudding user response:
There is more than one way to do this with ES. The simplest is to make a runtime mapping that contains the weekday as a term. Aggregate on that and subaggregate by the hour with hard bounds for your required time. I havent tried this with real data, so pls tweak as necessary.
{
"runtime_mappings": {
"weekday": {
"type": "keyword",
"script": "
long date = doc['date'].value.toInstant().toEpochMilli();
ZonedDateTime zdt = ZonedDateTime.ofInstant(instant, ZoneId.of('Z'));
String format = "EEEE";
DateTimeFormatter formatter = DateTimeFormatter.ofPattern(format).withLocale(Locale.ENGLISH);
emit(formatter.format(zdt));
"
}
},
"aggs": {
"weekday": {
"terms": {
"field": "weekday"
},
"aggs" : {
"hourly" : {
"date_histogram" : {"field" : "date", "calendar_interval" : "hour", "hard_bounds" : {"min" : 8, "max" : 9}}
}
}
}
}
}