How can I group by same field with multiple values?

Time:10-28

When I search for Elasticsearch aggregations, I only find multi_terms, which groups by multiple fields. I am looking for a way to group by a single field with multiple values.

I have a field product whose value can be fruit, electronic, veg, furniture, etc. I would like to group all documents whose value is either fruit or veg into one bucket. How can I achieve that?

I am looking for a way that does not require updating the index mapping. Since the values of product change frequently, I need to support any combination of grouped values at runtime.

CodePudding user response:

You can use a script together with a terms aggregation.

I have used runtime_mappings with a script that emits a single value for the products vegetable and fruit, and the original value otherwise. You can then group on that runtime field.

To improve performance, you can also index the field instead of computing it at search time.

{
  "runtime_mappings": {
    "product_custom": {
      "type": "keyword",
      "script": {
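        "source": """
          // Skip documents that have no value for product
          if (doc["product.keyword"].size() != 0) {
            String product = doc["product.keyword"].value;
            if (product == "vegetable" || product == "fruit") {
              // Merge both values into one bucket key
              emit("vegetable/fruit");
            } else {
              emit(product);
            }
          }
        """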
      }
    }
  },
  "aggs": {
    "product_custom": {
      "terms": {
        "field": "product_custom"
      }
    }
  }
}

Result

"aggregations" : {
    "product_custom" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [
        {
          "key" : "vegetable/fruit",
          "doc_count" : 3
        },
        {
          "key" : "electronic",
          "doc_count" : 1
        }
      ]
    }
  }
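
The question asks for any combination of values at runtime. A minimal sketch of one way to get that, assuming the same product.keyword field as in the request above: pass the values to merge as script params instead of hard-coding them in the script. The parameter names group and label are made up for this illustration and are not part of the original answer.

```json
{
  "runtime_mappings": {
    "product_custom": {
      "type": "keyword",
      "script": {
        "source": "if (doc['product.keyword'].size() != 0) { String p = doc['product.keyword'].value; if (params.group.contains(p)) { emit(params.label); } else { emit(p); } }",
        "params": {
          "group": ["vegetable", "fruit"],
          "label": "vegetable/fruit"
        }
      }
    }
  },
  "aggs": {
    "product_custom": {
      "terms": {
        "field": "product_custom"
      }
    }
  }
}
```

With this shape, a different grouping only needs different params in the search request; the script source stays the same.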

Update

Runtime fields are available from version 7.11 onwards.

You can also use a script in the terms aggregation to achieve the same result:

{
  "aggs": {
    "product_custom": {
      "terms": {
        "script": {
          "source": """
            String product = doc["product.keyword"].value;
            // Merge vegetable and fruit into one bucket key
            if (product == "vegetable" || product == "fruit") {
              return "vegetable/fruit";
            }
            return product;
          """
        }
      }
    }
  }
}

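Scripts and runtime mappings are slow because everything is computed at search time. You can add the runtime field to your index mapping without recreating the index, so the script no longer has to be sent with every request. Keep in mind that a runtime field in the mapping is still evaluated at search time; for a real speed-up the value has to be indexed as a regular field.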
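
As a rough sketch of adding the runtime field to an existing index: the body below would be sent with the update mapping API (PUT <your-index>/_mapping, where the index name is a placeholder). It assumes the same product.keyword field and the same fixed vegetable/fruit grouping as in the answer.

```json
{
  "runtime": {
    "product_custom": {
      "type": "keyword",
      "script": {
        "source": "if (doc['product.keyword'].size() != 0) { String p = doc['product.keyword'].value; if (p == 'vegetable' || p == 'fruit') { emit('vegetable/fruit'); } else { emit(p); } }"
      }
    }
  }
}
```

After that, the terms aggregation on product_custom can be used without a runtime_mappings section in the search request. Because the grouping is then fixed in the mapping, this fits a stable grouping better than one that changes per request.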
