Is it possible for total doc_count of aggregation buckets to be greater than the total hits value?

Time:09-17

In one of my Elasticsearch queries I am running an aggregation, and I found that the sum of the buckets' doc_count values is greater than the total number of hits (in my example, 2085697 total hits vs. 3071915 total bucket doc_counts). Is this normal? I had assumed that total hits would always equal the sum of the bucket doc_counts, or exceed it if the field named in the aggregation is missing from some documents.


CodePudding user response:

Yes, this is definitely possible if the field you're aggregating on contains an array of values: such a document is counted once in every bucket whose key appears in its array.

For instance, let's say you have the following document:

{
   "result_type": [1, 2]
}

If you run a terms aggregation on the result_type field (named resultType here), you'll get the following response: hits.total.value = 1, because there is only one document, but two buckets, each with doc_count = 1.

{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "resultType" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : 1,
          "doc_count" : 1
        },
        {
          "key" : 2,
          "doc_count" : 1
        }
      ]
    }
  }
}
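The counting behaviour above can be imitated in a few lines of plain Python. This is only a rough sketch of the idea, not how Elasticsearch implements it; the helper name terms_bucket_counts and the sample documents are made up for illustration. It shows both effects from the question: array values push the bucket total above the hit count, while documents missing the field fall into no bucket.

```python
from collections import Counter


def terms_bucket_counts(docs, field):
    """Hypothetical helper: per distinct value, count the documents containing it."""
    counts = Counter()
    for doc in docs:
        value = doc.get(field)
        if value is None:
            continue  # a document without the field lands in no bucket
        values = value if isinstance(value, list) else [value]
        for v in set(values):  # a document counts at most once per bucket
            counts[v] += 1
    return counts


# Illustrative data only: two multi-valued documents and one without the field.
docs = [
    {"result_type": [1, 2]},
    {"result_type": [1, 2, 3]},
    {"other_field": "x"},
]

buckets = terms_bucket_counts(docs, "result_type")
total_hits = len(docs)                     # every document is one hit
total_bucket_docs = sum(buckets.values())  # sum of all bucket doc_counts

print("hits:", total_hits)
print("buckets:", dict(buckets))
print("sum of doc_count:", total_bucket_docs)
```

With this sample data the sum of the bucket counts exceeds the number of documents, even though one document contributes to no bucket at all, so neither total bounds the other in general.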