Elasticsearch aggregation limitation


When I create an aggregate query, what scope is it applied to: all entries in an index, or just the first 10,000? For example, here is a response I got for a scripted metric aggregation:

{
    "took": 76,
    "timed_out": false,
    "_shards": {
        "total": 5,
        "successful": 5,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": {
            "value": 10000,
            "relation": "gte"
        },
        "max_score": null,
        "hits": []
    },
    "aggregations": {
        "number_of_operations_in_progress": {
            "value": 2
        }
    }
}

hits->total->value is 10000, which makes me think the aggregate function is applied to the first 10,000 entries only, not the whole data set in the index.

Is my understanding correct? If so, is there a way to apply an aggregate function to all entries?

CodePudding user response:

Aggregations are always applied to the whole document set that is selected by the query.

hits.total.value only gives a hint at how many documents match the query; in this case the "gte" relation means at least 10,000 documents match, but the aggregation still ran over all of them.
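
One way to confirm this is to run the aggregation with "size": 0: the hits array stays empty and hits.total.value is still capped at 10,000, yet the aggregation is computed over every document the query matches. A minimal sketch, assuming a hypothetical index index1 with a field k1:

POST index1/_search
{
  "size": 0,
  "query": {
    "match_all": {}
  },
  "aggs": {
    "docs_with_k1": {
      "value_count": {
        "field": "k1"
      }
    }
  }
}

The value_count result reflects all matching documents that have a value for k1, regardless of the 10,000 cap shown in hits.total.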

CodePudding user response:

You can use track_total_hits to control how the total number of hits should be tracked:

POST index1/_search
{
  "track_total_hits": true,
  "query": {
    "match_all": {}
  },
  "aggs": {
    "groupbyk1": {
      "terms": {
        "field": "k1"
      }
    }
  }
}
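
track_total_hits also accepts an integer instead of true, in which case the total is counted accurately up to that threshold and reported as "gte" beyond it. A sketch with a hypothetical threshold of 100,000:

POST index1/_search
{
  "track_total_hits": 100000,
  "query": {
    "match_all": {}
  },
  "aggs": {
    "groupbyk1": {
      "terms": {
        "field": "k1"
      }
    }
  }
}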