How the keyword type gets stored and analyzed in Elasticsearch

Time:06-28

As I understand it, a keyword field is not analyzed and is indexed as a single exact term. For example, "shut down" is indexed as "shut down", whereas a text field is run through the default analyzer (or a custom one, if specified), which splits "shut down" into the two terms [shut, down] before indexing. The same applies at search time: to match a keyword field we have to search for its exact value, while for a text field we can search for any one or more of the terms in the original text.
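One way to check this understanding is the _analyze API, which returns the tokens an analyzer produces for a given text. The requests below are a minimal sketch, assuming the built-in keyword analyzer as a stand-in for how a keyword field treats its value, and the built-in standard analyzer, which is what a text field uses when no analyzer is configured:

```json
# keyword analyzer: the whole input is kept as one token
POST _analyze
{
  "analyzer": "keyword",
  "text": "shut down"
}

# standard analyzer: the input is split on word boundaries and lowercased
POST _analyze
{
  "analyzer": "standard",
  "text": "shut down"
}
```

The first request should return the single token "shut down", and the second should return the two tokens "shut" and "down".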

I have an index named sample_index with two fields: description, of type keyword, and message, of type text.

This is the mapping of the index named sample_index: [image: mapping of sample_index]

Query

POST sample_index/_search
{
  "query": {
    "query_string": {
      "query": "keyword"
    }
  }
}

This is the output of the above query: [image: search results]

Here you can see that searching for the word "keyword", which is present in the description field (of type keyword), returns results. As I understand it, this should not be possible, because for the keyword type the whole text gets indexed as-is, without being split. How is this possible, or is something wrong with my understanding?

ES Version: 5.6.4

CodePudding user response:

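TL;DR

In version 5.6, a query_string query that does not specify default_field (or fields) falls back to the _all field, as long as _all is enabled, which is the default for indices created in 5.x.

The _all field is a concatenation of all the fields in the document. From the documentation: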

The _all field is a special catch-all field which concatenates the values of all of the other fields into one big string, using space as a delimiter, which is then analyzed and indexed, but not stored. This means that it can be searched, but not retrieved.

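This is why you get these results: the value of description is copied into _all, where it is analyzed into individual terms, so the term "keyword" matches there even though description itself is not analyzed.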
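To see the difference, the same search can be pinned to the description field so that _all is not consulted. A sketch, reusing the index and field names from the question and only adding default_field to the original request:

```json
# same query as in the question, but restricted to the keyword field
POST sample_index/_search
{
  "query": {
    "query_string": {
      "default_field": "description",
      "query": "keyword"
    }
  }
}
```

With this request a document should match only if the entire value of description is exactly "keyword"; a longer value that merely contains the word should no longer be returned.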

CodePudding user response:

Your message field is actually of type text. It also has a sub-field of type keyword, but that sub-field is not relevant to your search query.

Because you're using query_string, which searches all fields by default, your query matches the word "keyword" in your message field, which is of type text. That is why searching for the word "keyword" works: text fields get analyzed.
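The two behaviours can be compared by querying each field explicitly. This is only a sketch: the sub-field name message.keyword is an assumption, since the mapping was posted as an image, so substitute whatever name the mapping actually uses.

```json
# analyzed text field: matches any document whose message contains the term "keyword"
POST sample_index/_search
{
  "query": {
    "match": { "message": "keyword" }
  }
}

# keyword sub-field (name assumed): matches only if the whole message is exactly "keyword"
POST sample_index/_search
{
  "query": {
    "term": { "message.keyword": "keyword" }
  }
}
```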

From the query_string documentation:

default_field: Defaults to the index.query.default_field index setting, which has a default value of *.

From the fields (multi-fields) documentation:

It is often useful to index the same field in different ways for different purposes. This is the purpose of multi-fields. For instance, a string field could be mapped as a text field for full-text search, and as a keyword field for sorting or aggregations:
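A mapping along those lines might look like the following sketch for a 5.x cluster. The index name multi_field_demo, the mapping type name doc, and the sub-field name keyword are all hypothetical choices made for illustration, not taken from the question's mapping:

```json
# hypothetical index and type names; 5.x mappings require a type name
PUT multi_field_demo
{
  "mappings": {
    "doc": {
      "properties": {
        "description": { "type": "keyword" },
        "message": {
          "type": "text",
          "fields": {
            "keyword": { "type": "keyword" }
          }
        }
      }
    }
  }
}
```

With such a mapping, message serves full-text search, while message.keyword holds the unanalyzed value for exact matching, sorting, and aggregations.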
