In one of my Elasticsearch queries I am running an aggregation, and I noticed that the sum of the buckets' doc_count values is greater than the total number of hits. (In my case it's 2085697 total hits vs. 3071915 summed bucket doc_counts.) Is this normal? I had assumed that total hits would always equal the sum of the bucket doc_counts, or be greater if the field specified in the aggregation is missing from some documents.
CodePudding user response:
If the field you're aggregating on contains an array of values, it is definitely possible: a document is counted once in every bucket whose key appears in its array.
For instance, let's say you have the following document:
{
  "result_type": [1, 2]
}
If you aggregate on the result_type field, you'll get the following response, i.e. hits.total.value = 1 (one document), but two buckets with doc_count = 1 each.
{
  "took" : 2,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "resultType" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : 1,
          "doc_count" : 1
        },
        {
          "key" : 2,
          "doc_count" : 1
        }
      ]
    }
  }
}
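To see why the numbers diverge, here is a minimal Python sketch that mimics the counting without talking to Elasticsearch at all. The function name terms_aggregation and the sample documents are made up for illustration, and the rule that a document counts at most once per bucket is a modeling assumption of this sketch, not a reproduction of Elasticsearch internals.

```python
from collections import Counter


def terms_aggregation(docs, field):
    """Hypothetical stand-in for a terms aggregation: for each distinct
    value of `field`, count the documents that contain that value."""
    buckets = Counter()
    for doc in docs:
        value = doc.get(field)
        if value is None:
            continue  # a document without the field lands in no bucket
        values = value if isinstance(value, list) else [value]
        # Assumption: a document is counted at most once per bucket,
        # so duplicates inside one array are collapsed with set().
        for key in set(values):
            buckets[key] += 1
    return buckets


# Made-up sample data: two multi-valued docs, one single-valued, one without the field.
docs = [
    {"result_type": [1, 2]},
    {"result_type": [2, 3]},
    {"result_type": 1},
    {"other_field": "x"},
]

buckets = terms_aggregation(docs, "result_type")
total_hits = len(docs)                 # what hits.total.value would report
bucket_total = sum(buckets.values())   # sum of all bucket doc_counts

print("total hits:", total_hits)
print("buckets:", dict(buckets))
print("sum of doc_counts:", bucket_total)
```

With this sample data there are 4 hits but the bucket counts add up to 5, because the two array-valued documents each land in two buckets, while the document missing the field lands in none. Both effects can occur in the same index, so the sum of doc_counts can end up above or below the total hits.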