Suppose that we are hat maker and have an Elasticsearch index where each document corresponds to the sale of one hat. Part of the sales record is the name of the store at which the hat was sold. I want to find out the number of hats sold by each store, and the average number of hats sold over all stores. The best way I have been able to figure out is this search:
GET hat_sales/_search
{
"size": 0,
"query": {"match_all": {}},
"aggs": {
"stores": {
"terms": {
"field": "storename",
"size": 65536
},
"aggs": {
"sales_count": {
"cardinality": {
"field": "_id"
}
}
}
},
"average_sales_count": {
"avg_bucket": {
"buckets_path": "stores>sales_count"
}
}
}
}
(Aside: I set the size to 65536 because that is the default maximum number of buckets.)
The problem with this query is that the sales_count
aggregation performs a redundant calculation: each stores
bucket already has a doc_count
property. But how can I access this doc_count
in a buckets path?
CodePudding user response:
I think this is what you are after
PUT hat_sales
{
"mappings": {
"properties": {
"storename": {
"type": "keyword"
}
}
}
}
POST hat_sales/_bulk?refresh=true
{"index": {}}
{"storename": "foo"}
{"index": {}}
{"storename": "foo"}
{"index": {}}
{"storename": "bar"}
{"index": {}}
{"storename": "baz"}
{"index": {}}
{"storename": "baz"}
{"index": {}}
{"storename": "baz"}
GET hat_sales/_search
{
"size": 0,
"query": {"match_all": {}},
"aggs": {
"stores": {
"terms": {
"field": "storename",
"size": 65536
}
},
"average_sales_count": {
"avg_bucket": {
"buckets_path": "stores>_count"
}
}
}
}
The path to get to the _count
is stores>_count
The results look like:
{
"took" : 6,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 6,
"relation" : "eq"
},
"max_score" : null,
"hits" : [ ]
},
"aggregations" : {
"stores" : {
"doc_count_error_upper_bound" : 0,
"sum_other_doc_count" : 0,
"buckets" : [
{
"key" : "baz",
"doc_count" : 3
},
{
"key" : "foo",
"doc_count" : 2
},
{
"key" : "bar",
"doc_count" : 1
}
]
},
"average_sales_count" : {
"value" : 2.0
}
}
}