Finding sum of the "key" values in bucket aggregations in Elasticsearch

Time:03-04

I have the following ES query:

GET database/_search
{
  "from": 0,
  "size": 0,
  "query": {
    "bool": {
      "must": [
        {
          "nested": {
            "query": {
              "term": {
                "colleges.institution_full_name": {
                  "value": "Academy of Sciences",
                  "boost": 1.0
                }
              }
            },
            "path": "colleges"
          }
        }
      ]
    }
  },
  "_source": false,
  "aggs": {
    "publication_years": {
      "terms": {
        "field": "publication_year"
      }
    }
  }
}

And I got the following response:

{
  "took" : 3,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 232,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "publication_years" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : 2016,
          "doc_count" : 119
        },
        {
          "key" : 2017,
          "doc_count" : 90
        },
        {
          "key" : 2018,
          "doc_count" : 22
        },
        {
          "key" : 2019,
          "doc_count" : 1
        }
      ]
    }
  }
}

Now I want to calculate the average of the key values of the publication years, i.e. the average of 2016, 2017, 2018 and 2019. How can I modify my ES query to get the average of the publication years instead of getting every year individually? I tried using the "avg" aggregation, but it also takes "doc_count" into account when calculating the average.
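To make the difference concrete, here is a small Python sketch (not part of the query itself) that works through both averages using the bucket values from the response above. The variable names are just for illustration.

```python
# Buckets copied from the response above, as (key, doc_count) pairs.
buckets = [(2016, 119), (2017, 90), (2018, 22), (2019, 1)]

# A plain "avg" on publication_year counts every document once,
# so each year is effectively weighted by its doc_count.
total_docs = sum(count for _, count in buckets)
weighted_avg = sum(year * count for year, count in buckets) / total_docs

# The average being asked for: every distinct year counts once.
unweighted_avg = sum(year for year, _ in buckets) / len(buckets)

print(total_docs)
print(round(weighted_avg, 2))
print(unweighted_avg)
```

The document-weighted value comes out at roughly 2016.59, pulled toward 2016 by its 119 documents, while the average over the four keys is 2017.5.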

CodePudding user response:

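Try this. The inner "avg" on each bucket simply returns that bucket's year, and "avg_bucket" then averages those per-bucket values, so each year counts once regardless of its "doc_count":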

POST database/_search
{
  "size": 0, 
  "aggs": {
    "groupByYear": {
      "terms": {
        "field": "publication_year"
      },
      "aggs": {
        "avgYear": {
          "avg": {
            "field": "publication_year"
          }
        }
      }
    },
    "avg_year": {
      "avg_bucket": {
        "buckets_path": "groupByYear>avgYear" 
      }
    }
  }
}
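Note that the request above leaves out the nested query from the question, so as written it would run over every document in the index. Below is a rough sketch of combining the two, built as a Python dict and printed as JSON. The helper name `build_request` and the `max_years` parameter are hypothetical, added only for illustration; `max_years` sets the `size` of the terms aggregation, which by default returns only the top 10 buckets.

```python
import json


def build_request(institution, max_years=100):
    """Build a search body: the question's nested filter plus the answer's aggs.

    `build_request` and `max_years` are illustrative names, not part of the
    original posts.
    """
    return {
        "size": 0,
        "query": {
            "bool": {
                "must": [
                    {
                        "nested": {
                            "path": "colleges",
                            "query": {
                                "term": {
                                    "colleges.institution_full_name": {
                                        "value": institution
                                    }
                                }
                            },
                        }
                    }
                ]
            }
        },
        "aggs": {
            "groupByYear": {
                # Raise the bucket limit so years are not silently dropped.
                "terms": {"field": "publication_year", "size": max_years},
                "aggs": {
                    # Every doc in a bucket has the same year, so this is the key.
                    "avgYear": {"avg": {"field": "publication_year"}}
                },
            },
            # Sibling pipeline aggregation: averages one value per bucket.
            "avg_year": {"avg_bucket": {"buckets_path": "groupByYear>avgYear"}},
        },
    }


body = build_request("Academy of Sciences")
print(json.dumps(body, indent=2))
```

The printed JSON can be pasted after `POST database/_search`; the result should then appear under `aggregations.avg_year.value` in the response.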

CodePudding user response:

It's not clear what you want. Do you want the average of 2016, 2017, 2018 and 2019? That would mean you want 2017.5?
