Elasticsearch, aggregate by array field's value

Time:12-11

My documents look like this:

[
    {
        'user_id': 1,
        'search_text': 'hi',
        'searched_departments': ["dep4", "dep5", "dep6"]
    },
    {
        'user_id': 1,
        'search_text': 'hi there',
        'searched_departments': ["dep4", "dep6"]
    },
    {
        'user_id': 5,
        'search_text': 'I like candy',
        'searched_departments': ["dep4", "dep11", "dep999"]
    },
    {
        'user_id': 2,
        'search_text': 'hi',
        'searched_departments': ["dep4", "dep6", "dep7"]
    }
]

I want to run an aggregation that returns the number of documents that mention each department, so in this case I want my end result to be something like:

{
"dep4" : 4,
"dep6" : 3,
"dep5" : 1,
# and so on
}

My mapping:

{'mappings': {'properties': {'date': {'type': 'date'},
                             'searched_departments': {'type': 'text'},
                             'search_text': {'type': 'text'},
                             'user_id': {'type': 'text'}}}}

CodePudding user response:

You cannot run a terms aggregation on a text field by default. (If you still want to aggregate on a text field, you have to enable fielddata on it by setting the fielddata parameter to true.) Instead, you can define the searched_departments field as a multi-field with both text and keyword types, and aggregate on the keyword sub-field.

Below is a sample mapping:

{
    "mappings": {
        "properties": {
            "date": {
                "type": "date"
            },
            "searched_departments": {
                "type": "text",
                "fields": {
                    "keyword": {
                        "type": "keyword",
                        "ignore_above": 256
                    }
                }
            },
            "search_text": {
                "type": "text"
            },
            "user_id": {
                "type": "text"
            }
        }
    }
}

Then the query below will give you the expected result:

POST index_name/_search
{
  "query": {
    "match_all": {}
  },
  "aggs": {
    "dept_count": {
      "terms": {
        "field": "searched_departments.keyword",
        "size": 10
      }
    }
  }
}
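
The aggregation comes back as a list of buckets under aggregations.dept_count rather than as the flat object shown in the question. Here is a minimal Python sketch of the conversion. It assumes the usual terms aggregation response shape (a "buckets" list of "key"/"doc_count" pairs). The sample_response below is written by hand for the four documents in the question and is not captured output from a cluster, and buckets_to_counts is a hypothetical helper name.

```python
from collections import Counter

# Hand-written example in the shape a terms aggregation response takes,
# filled in for the four sample documents from the question (illustrative only).
sample_response = {
    "aggregations": {
        "dept_count": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
                {"key": "dep4", "doc_count": 4},
                {"key": "dep6", "doc_count": 3},
                {"key": "dep11", "doc_count": 1},
                {"key": "dep5", "doc_count": 1},
                {"key": "dep7", "doc_count": 1},
                {"key": "dep999", "doc_count": 1},
            ],
        }
    }
}


def buckets_to_counts(response, agg_name="dept_count"):
    """Flatten the buckets of a terms aggregation into {key: doc_count}."""
    buckets = response["aggregations"][agg_name]["buckets"]
    return {bucket["key"]: bucket["doc_count"] for bucket in buckets}


# Local cross-check: count each department once per document,
# which is what doc_count means for an array field.
documents = [
    ["dep4", "dep5", "dep6"],
    ["dep4", "dep6"],
    ["dep4", "dep11", "dep999"],
    ["dep4", "dep6", "dep7"],
]
local_counts = Counter(dep for deps in documents for dep in set(deps))

counts = buckets_to_counts(sample_response)
print(counts)
```

Note that "size": 10 in the query returns only the ten most frequent departments. If you have more distinct values than that, raise size accordingly.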