I have a mapping in Elasticsearch with a field analyzer that uses the following tokenizer:
"tokenizer": {
"3gram_tokenizer": {
"type": "nGram",
"min_gram": "3",
"max_gram": "3",
"token_chars": [
"letter",
"digit"
]
}
}
Now I am trying to search for the name "avinash" in Elasticsearch with the query "acinash". The ES query formed is:
{
  "size": 5,
  "query": {
    "bool": {
      "must": [
        {
          "multi_match": {
            "query": "acinash",
            "fields": [
              "name"
            ],
            "type": "best_fields",
            "operator": "AND",
            "slop": 0,
            "fuzziness": "1",
            "prefix_length": 0,
            "max_expansions": 50,
            "zero_terms_query": "NONE",
            "auto_generate_synonyms_phrase_query": false,
            "fuzzy_transpositions": false,
            "boost": 1.0
          }
        }
      ],
      "adjust_pure_negative": true,
      "boost": 1.0
    }
  }
}
In ES version 6.8 I get the desired result, i.e. "avinash" when querying "acinash" (because of fuzziness), but in ES version 7.1 I get nothing.
The same happens when searching for "avinash" using "avinaah": in 6.8 I get results, but in 7.1 I do not.
What ES does is convert the query into the tokens [aci, cin, ina, nas, ash], which should ideally match against the tokenized inverted index in ES, whose tokens are [avi, vin, ina, nas, ash].
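The trigram overlap described above can be verified with a short sketch outside of Elasticsearch (plain Python, mimicking the `3gram_tokenizer` with min_gram = max_gram = 3):

```python
def trigrams(text, n=3):
    """All contiguous character n-grams of text, like the nGram tokenizer."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]

query_tokens = trigrams("acinash")  # ['aci', 'cin', 'ina', 'nas', 'ash']
index_tokens = trigrams("avinash")  # ['avi', 'vin', 'ina', 'nas', 'ash']

# Only three of the five query trigrams match the indexed trigrams exactly;
# the remaining two ('aci', 'cin') can only match via fuzziness.
shared = [t for t in query_tokens if t in index_tokens]
print(shared)  # ['ina', 'nas', 'ash']
```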
But why is it not matching in 7.1?
CodePudding user response:
It's not related to the ES version.
Update max_expansions to more than 50.
max_expansions is the maximum number of term variations the fuzzy query will expand to.
With 3-grams over letters and digits as token_chars, an ideal max_expansions would be (26 letters + 10 digits) * 3.
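As a sketch of that suggestion, the multi_match clause from the question would be changed along these lines (108 here follows the (26 + 10) * 3 estimate above; any value comfortably above the default of 50 serves the same purpose):

```json
"multi_match": {
  "query": "acinash",
  "fields": ["name"],
  "type": "best_fields",
  "operator": "AND",
  "fuzziness": "1",
  "max_expansions": 108
}
```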