ElasticSearch searching for <NUM> analyzed fields not matching

Time:10-29

I have an Elasticsearch cluster that uses the standard analyzer. I understand that with this analyzer a term such as "300" is assigned the token type <NUM>.

Suppose I am searching for a document with a field "name" with value "Paper towel 300 CT" which is analyzed as ["paper"(ALPHANUM), "towel"(ALPHANUM), "300"(NUM), "ct"(ALPHANUM)]
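
To make the token list above concrete, here is a rough Python approximation of what the standard analyzer does to these strings. It is only a sketch: the function name `approx_standard_analyze` is made up for illustration, and the real analyzer follows Unicode word-break rules rather than a simple regex. The point it shows is that "300 CT" yields two tokens, while "300CT" stays a single token.

```python
import re

def approx_standard_analyze(text):
    """Rough stand-in for the standard analyzer (an approximation, not
    Lucene's implementation): split on non-word characters, lowercase,
    and label all-digit tokens as <NUM>."""
    tokens = []
    for raw in re.findall(r"\w+", text):
        token_type = "<NUM>" if raw.isdigit() else "<ALPHANUM>"
        tokens.append((raw.lower(), token_type))
    return tokens

indexed = approx_standard_analyze("Paper towel 300 CT")
searched = approx_standard_analyze("300CT")
print(indexed)   # four tokens, with "300" typed <NUM>
print(searched)  # one token: "300ct", typed <ALPHANUM>
```

Because "300ct" is a different term from both "300" and "ct", an exact term lookup for it finds nothing in this document.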

Currently, I use a combined fuzzy/wildcard query like this:

    {
      "query": {
        "bool": {
          "must": [
            {
              "bool": {
                "should": [
                  {
                    "fuzzy": {
                      "ec_item_name": {
                        "value": "300CT"
                      }
                    }
                  },
                  {
                    "wildcard": {
                      "ec_item_name": {
                        "value": "3*0*0*C*T*"
                      }
                    }
                  }
                ],
                "minimum_should_match": 1
              }
            }
          ]
        }
      }
    }

The fuzzy query does not match, no matter how I tune the fuzziness. I would like the term "300CT" to match "300", and likewise "300" to match "300CT". Is there an analyzer, or a way to implement a custom analyzer, that analyzes all terms so that this type of search works? I am having trouble finding documentation about this behavior.

CodePudding user response:

Have you tried with the match query and the fuzziness option?

I tried it with the standard analyzer and it works fine for me:

{
  "query": {
    "match": {
      "name": {
        "query": "300CT",
        "fuzziness": 2
      }
    }
  }
}
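
Why a fuzziness of 2 is enough here can be checked with a small edit-distance calculation. The sketch below uses plain Levenshtein distance. Elasticsearch also counts a transposition as a single edit by default, but no transposition is involved in this case. The helper `auto_fuzziness` mirrors the documented `AUTO` thresholds (3 and 6) and is an illustrative assumption about the default, so check it against your version.

```python
def edit_distance(a, b):
    """Classic Levenshtein distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]

def auto_fuzziness(term):
    """Edits allowed under AUTO with the documented 3/6 length thresholds."""
    if len(term) < 3:
        return 0
    return 1 if len(term) < 6 else 2

query_term = "300ct"  # "300CT" after the match query analyzes (lowercases) it
for indexed_term in ["paper", "towel", "300", "ct"]:
    print(indexed_term, edit_distance(query_term, indexed_term))
print("AUTO allows", auto_fuzziness(query_term), "edit(s)")
```

"300ct" is two deletions away from "300", so an explicit fuzziness of 2 reaches it, whereas a five-character term under `AUTO` is allowed only one edit.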

CodePudding user response:

Be careful with the fuzzy query: it does not analyze the query text before generating the fuzzy alternative terms for the search. There is a good article on the Elastic Blog that explains fuzzy queries. About the fuzzy query type, it states:

The elasticsearch fuzzy query type should generally be avoided.

That said, I cannot reproduce your problem, because the term 300CT does match 300 when I search with a fuzziness of 2. Try the following example in Kibana Dev Tools:

PUT fuzzytest

PUT fuzzytest/_doc/1
{
  "name": "Paper towel 300 CT"
}

POST fuzzytest/_explain/1
{
  "query": {
    "fuzzy": {
      "name": {
        "value": "300CT",
        "fuzziness": 2
      }
    }
  }
}
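
On the original question about a custom analyzer: one option not covered in this thread is a token filter that splits tokens at letter/digit transitions. Elasticsearch's `word_delimiter_graph` filter does this through its `split_on_numerics` option. Whether it suits this particular mapping is an assumption. The Python sketch below only approximates the effect of such a filter (whitespace tokenizing, splitting, lowercasing), and its function names are hypothetical.

```python
import re

def split_on_numerics(token):
    """Split a token at letter/digit transitions and lowercase the parts."""
    return [part.lower() for part in re.findall(r"\d+|[^\W\d_]+", token)]

def analyze(text):
    """Whitespace-tokenize, then split each token at letter/digit boundaries."""
    terms = []
    for token in text.split():
        terms.extend(split_on_numerics(token))
    return terms

indexed = analyze("Paper towel 300 CT")
print(analyze("300CT"))                     # the query is split into two terms
print(set(analyze("300CT")) <= set(indexed))  # every query term is in the index
```

With this kind of analysis applied at both index and search time, "300CT" and "300 CT" produce the same terms, so an ordinary match query can find either form without fuzziness.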