When I search for Elasticsearch aggregations, I only find multi_terms, which groups by multiple fields. But I am looking for a way to group by one field with multiple values.
I have a field product whose value can be fruit, electronic, veg, furniture, etc. I would like to group all documents whose value is either fruit or veg into one bucket. How can I achieve that?
I am looking for a way to do this without updating the index mapping. Since the values of product change frequently, I need to support any combination of grouped values at runtime.
CodePudding user response:
You can use a script together with a terms aggregation.
The request below uses runtime_mappings to define a scripted field. It emits a single value for both vegetable and fruit products and the original value for everything else, so you can group on that field.
You can also index the runtime field to improve performance.
{
  "runtime_mappings": {
    "product_custom": {
      "type": "keyword",
      "script": {
        "source": """
          if (doc["product.keyword"].value == "vegetable"
              || doc["product.keyword"].value == "fruit")
          {
            emit("vegetable/fruit");
          }
          else
          {
            emit(doc["product.keyword"].value);
          }
        """
      }
    }
  },
  "aggs": {
    "product_custom": {
      "terms": {
        "field": "product_custom"
      }
    }
  }
}
Result
"aggregations" : {
  "product_custom" : {
    "doc_count_error_upper_bound" : 0,
    "sum_other_doc_count" : 0,
    "buckets" : [
      {
        "key" : "vegetable/fruit",
        "doc_count" : 3
      },
      {
        "key" : "electronic",
        "doc_count" : 1
      }
    ]
  }
}
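Since the question asks for any combination of values at runtime, the hard-coded values in the runtime mapping above could instead be passed as script params. The following is only a sketch and has not been run against a cluster: the helper name build_grouped_terms_request and the param names group and label are made up for illustration, and the product.keyword field name is taken from the answer above. It only builds the request body; sending it to the _search endpoint is left to whatever client you use.

```python
import json

# Painless source for the runtime field. The values to merge and the merged
# label come from script params, so the source text stays the same for every
# combination. Documents without a value for the field emit nothing.
SOURCE = """
if (doc['product.keyword'].size() != 0) {
  String v = doc['product.keyword'].value;
  emit(params.group.contains(v) ? params.label : v);
}
""".strip()


def build_grouped_terms_request(values, name="product_custom"):
    """Build a search body that buckets `values` together and keeps other values as-is.

    `name` is a hypothetical runtime field name, not something the index must define.
    """
    label = "/".join(values)  # e.g. "vegetable/fruit"
    return {
        "size": 0,  # only the aggregation is needed, not the hits
        "runtime_mappings": {
            name: {
                "type": "keyword",
                "script": {
                    "source": SOURCE,
                    "params": {"group": list(values), "label": label},
                },
            }
        },
        "aggs": {name: {"terms": {"field": name}}},
    }


if __name__ == "__main__":
    print(json.dumps(build_grouped_terms_request(["vegetable", "fruit"]), indent=2))
```

Because only the params change between requests, the same script source is reused for every combination of values.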
Update
Runtime fields are available from version 7.11 onwards.
You can also use a script directly in the terms aggregation to achieve the same result:
{
  "aggs": {
    "product_custom": {
      "terms": {
        "script": {
          "source": """
            if (doc["product.keyword"].value == "vegetable"
                || doc["product.keyword"].value == "fruit")
            {
              return "vegetable/fruit";
            }
            else
            {
              return doc["product.keyword"].value;
            }
          """
        }
      }
    }
  }
}
Scripts and runtime mappings are slow because everything is computed at search time. You can add a runtime field to your index mapping without recreating the index; it will give better performance than a script in the aggregation.
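As a rough sketch of that last suggestion, a runtime field can be added to an existing index by sending a body with a "runtime" section to the index's _mapping endpoint (PUT). The helper name build_runtime_mapping_update and the param names are hypothetical, and the index name is deliberately left out because the thread does not give one. Note that the grouped values are then fixed in the mapping until it is updated again, which is a trade-off against the "any combination at runtime" requirement in the question.

```python
import json


def build_runtime_mapping_update(values, name="product_custom"):
    """Build the body for a mapping update that defines a scripted runtime field.

    `values` are the product values to merge into one bucket; `name` is a
    hypothetical field name chosen for illustration.
    """
    source = (
        "if (doc['product.keyword'].size() != 0) {"
        " String v = doc['product.keyword'].value;"
        " emit(params.group.contains(v) ? params.label : v);"
        " }"
    )
    return {
        "runtime": {
            name: {
                "type": "keyword",
                "script": {
                    "source": source,
                    "params": {"group": list(values), "label": "/".join(values)},
                },
            }
        }
    }


if __name__ == "__main__":
    print(json.dumps(build_runtime_mapping_update(["vegetable", "fruit"]), indent=2))
```

Once the field is in the mapping, a plain terms aggregation on product_custom can be used without repeating the script in each request.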