Home > Blockchain >  How to return hour by hour average for the last X days?
How to return hour by hour average for the last X days?

Time:08-24

I have the following script to return an aggregation hour by hour of last x days on ExecNom:

GET /mkt-with-time/_search
{
  "size": 0,
  "query": {
    "range": {
      "@timestamp": {
        "gte": "now-7d/d",
        "lte": "now-2d/d"
      }
   }
  },
  "aggs": {
        "per_hour": {
            "date_histogram": {
                "field": "@timestamp",
                "calendar_interval": "hour"
            },
            "aggs": {
                "avg_execnom": {
                    "avg": {
                        "field": "ExecNom"
                    }
                }
            }
        }
    }
}

this returns:

"aggregations": {
    "per_hour": {
      "buckets": [
        {
          "key_as_string": "2022-08-14T00:00:00.000Z",
          "key": 1660435200000,
          "doc_count": 3633,
          "avg_execnom": {
            "value": 252701.0352904938
          }
        },
        {
          "key_as_string": "2022-08-14T01:00:00.000Z",
          "key": 1660438800000,
          "doc_count": 3651,
          "avg_execnom": {
            "value": 258698.55131756698
          }
        },
        ...
        {
          "key_as_string": "2022-08-15T00:00:00.000Z",
          "key": 1660521600000,
          "doc_count": 3559,
          "avg_execnom": {
            "value": 248551.24137114585
          }
        },
        {
          "key_as_string": "2022-08-15T01:00:00.000Z",
          "key": 1660525200000,
          "doc_count": 3553,
          "avg_execnom": {
            "value": 250348.02511778387
          }
        },
        {
          "key_as_string": "2022-08-15T02:00:00.000Z",
          "key": 1660528800000,
          "doc_count": 3692,
          "avg_execnom": {
            "value": 258052.43654516252
          }
        },
        ...
        ]
    }
}

I have data hour by hour for today: for the simplicity, I write 'today' as if it is today's date.

{
  "key_as_string": "todayT02:00:00.000Z",
  "key": 1660528800000,
  "doc_count": 3692,
  "avg_execnom": {
    "value": 258052.43654516252
  }
},

{
  "key_as_string": "todayT03:00:00.000Z",
  "key": 1660528800000,
  "doc_count": 3692,
  "avg_execnom": {
    "value": 258052.43654516252
  }
}
...

I am looking for a way to compare hour by hour, today's data versus the one of last X days. I have already today's data, how to find the hour by hour average for last X days.

To be explicit:

[
    avg(2022-08-14T00:00:00, 2022-08-15T00:00:00, ..., 2022-08-22T00:00:00),
    avg(2022-08-14T01:00:00, 2022-08-15T01:00:00, ..., 2022-08-22T01:00:00),
    avg(2022-08-14T02:00:00, 2022-08-15T02:00:00, ..., 2022-08-22T02:00:00),
    ...
    avg(2022-08-14T23:00:00, 2022-08-15T23:00:00, ..., 2022-08-22T23:00:00)
]

Do you have an idea how to achieve that?

Inspired from this: https://discuss.elastic.co/t/average-per-day-of-week-aggregation/124132/2

this is my last try but does not work:


GET /mkt-with-time/_search
{
  "size": 0,
  "aggs": {
    "orders_per_hour_of_day": {
      "terms": {
        "script" : {
            "lang": "painless",
            "source": "doc['@timestamp'].value.getHour()"
        },
        "size": 24
      },
      "aggs": {
        "dayOfWeek": {
          "terms": {
            "script": {
              "lang": "painless",
              "source": "doc['@timestamp'].value.getDayOfWeekEnum()"
            }
          },
          "aggs": {
            "total_execnom": {
              "sum": {
                "field": "ExecNom"
              }
            },
            "number_of_weeks": {
              "date_histogram": {
                "field": "@timestamp",
                "calendar_interval": "week"
              }
            },
            "average_orders_per_hour_day_of_week": {
              "bucket_selector": {
                "buckets_path": {
                  "totalExecNom" : "total_execnom",
                  "number_of_weeks": "number_of_weeks._bucket_count"
                },
                "script": "params.totalExecNom / params.number_of_weeks"
              }
            }
          }
        }
      }
    }
  }
}

Thanks!

CodePudding user response:

You can use runtime mapping for this kind of queries, so your query will look like this:

GET /mkt-with-time/_search
{
  "size": 0,
  "runtime_mappings": {
    "hour_of_day": {
      "type": "keyword",
      "script": {
        "source": "emit(doc['@timestamp'].value.getHour().toString())"
      }
    }
  },
  "aggs": {
    "per_hour": {
      "terms": {
        "field": "hour_of_day",
        "size":30
      },
      "aggs": {
        "avg_per_dat": {
          "avg": {
            "field": "ExecNom"
          }
        }
      }
    }
  }
}

In this case, in the first step, Elastic will build new field named "hour_of_day" contains hour number 1 -24, and in the aggregation you will use regular avg terms connection

CodePudding user response:

thank you very much for your reply @Vakhtang. I'm struggling to do a sort, to make the bucket 0->23 sorted. 0->23.

# with sort
GET /mkt-with-time/_search
{
  "size": 0,
  "runtime_mappings": {
    "hour_of_day": {
      "type": "keyword",
      "script": {
        "source": "emit(doc['@timestamp'].value.getHour().toString())"
      }
    }
  },
  "aggs": {
    "per_hour": {
      "terms": {
        "field": "hour_of_day",
        "size":30
      },
      
      "aggs": {
        "avg_per_date": {
          "avg": {
            "field": "ExecNom"
          }
        }
      },
      "mysort":{
          "bucket_sort": {
            "sort": [ 
              {"per_hour": {"order": "desc"}}
            ]
          }
        }
    }
  }
}

Yelling:

"reason": "Found two aggregation type definitions in [per_hour]: [terms] and [mysort]"

  • Related