Elasticsearch Multimatch substring not working

Time:07-29

I have a record with the following field:

"fullName" : "Virat Kohli"

I have written the following multi_match query, which should fetch this record:

GET _search
{
  "query": {
    "multi_match": {
      "query": "*kohli*",
      "fields": [
        "fullName^1.0",
        "team^1.0"
      ],
      "type": "phrase_prefix",
      "operator": "OR",
      "slop": 0,
      "prefix_length": 0,
      "max_expansions": 50,
      "zero_terms_query": "NONE",
      "auto_generate_synonyms_phrase_query": true,
      "fuzzy_transpositions": true,
      "boost": 1
    }
  }
}

This works fine.

But when I remove the letter 'k' from the query and change it to:

"query": "*ohli*"

it doesn't fetch any record.

Why is this happening? How can I modify the query so that the record is also returned for the second search term?

Answer:

First, let me explain why your existing query didn't work, and then the solution.

Problem: You are using the multi_match query with type phrase_prefix. As explained in the documentation, it makes a prefix query on the last search term. In your case there is only one search term, so Elasticsearch performs a prefix query on that term.

A prefix query works on the exact tokens in the index. You are most likely using the standard analyzer, which is the default for text fields, so the fullName field is indexed as the tokens virat and kohli. Your search term *kohli* also generates the token kohli (notice the lowercase k, and that the * characters are dropped), because the standard analyzer lowercases tokens as well. kohli is a prefix of the indexed token kohli, so the first query matches; ohli is not a prefix of any indexed token, so the second query returns nothing. You can check this with the explain API output of your first request, as shown below.

"_explanation": {
                    "value": 0.2876821,
                    "description": "max of:",
                    "details": [
                        {
                            "value": 0.2876821,
                            "description": "weight(fullName:kohli in 0) [PerFieldSimilarity], result of:",
                            "details": [
                                {

(Note the search term in the weight description.)
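
To make the token-level behaviour concrete, here is a minimal Python sketch. It is not Elasticsearch code: analyze is a hypothetical, much simplified stand-in for the standard analyzer (split on non-alphanumeric characters, then lowercase), and phrase_prefix_single_term is a hypothetical helper that only imitates the single-term case described above.

```python
import re


def analyze(text):
    """Rough approximation of the standard analyzer (an assumption, not the
    real implementation): split on non-alphanumeric characters and lowercase."""
    return [token.lower() for token in re.findall(r"[A-Za-z0-9]+", text)]


def phrase_prefix_single_term(query, field_value):
    """Imitate phrase_prefix with a single search term: the analyzed term
    must be a prefix of at least one indexed token."""
    terms = analyze(query)
    if len(terms) != 1:
        raise ValueError("this sketch only covers a single search term")
    prefix = terms[0]
    return any(token.startswith(prefix) for token in analyze(field_value))


indexed_value = "Virat Kohli"

print(analyze(indexed_value))                               # ['virat', 'kohli']
print(analyze("*kohli*"))                                   # ['kohli']
print(phrase_prefix_single_term("*kohli*", indexed_value))  # True
print(phrase_prefix_single_term("*ohli*", indexed_value))   # False
```

The second search term fails because ohli is only a substring of kohli, not a prefix of it.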

Solution

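Since you are trying to use wildcards in your query, the simplest solution is to use the wildcard query against your field, as shown below; it returns the record in both cases. Keep in mind that patterns starting with * are slow on large indices, because Elasticsearch has to check many terms.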

{
  "query": {
    "wildcard": {
      "fullName": {
        "value": "*ohli",
        "boost": 1.0,
        "rewrite": "constant_score"
      }
    }
  }
}

And the search result:

"hits": [
            {
                "_shard": "[match_query][0]",
                "_node": "BKVyHFTiSCeq4zzD-ZqMbA",
                "_index": "match_query",
                "_type": "_doc",
                "_id": "1",
                "_score": 1.0,
                "_source": {
                    "id": 2,
                    "fullName": "Virat Kohli",
                    "team": [
                        "Royal Challengers Bangalore",
                        "India"
                    ]
                },
                "_explanation": {
                    "value": 1.0,
                    "description": "fullName:*ohli",
                    "details": []
                }
            }
        ]
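
The reason the wildcard query succeeds can be sketched in Python as well. This is only an approximation under stated assumptions: the token list is written out by hand (what the standard analyzer is assumed to produce for "Virat Kohli"), wildcard_matches is a hypothetical helper, and fnmatchcase from the Python standard library stands in for Elasticsearch's wildcard matching (it also accepts [seq] patterns, which the wildcard query does not).

```python
from fnmatch import fnmatchcase

# Tokens assumed to be indexed for "Virat Kohli" by the standard analyzer.
indexed_tokens = ["virat", "kohli"]


def wildcard_matches(pattern, tokens):
    """Return True if the pattern matches at least one whole token.
    '*' matches any run of characters, '?' matches a single character."""
    return any(fnmatchcase(token, pattern) for token in tokens)


print(wildcard_matches("*ohli", indexed_tokens))    # True
print(wildcard_matches("*kohli*", indexed_tokens))  # True
print(wildcard_matches("ohli", indexed_tokens))     # False
```

Unlike the prefix case, the leading * lets the pattern start anywhere inside a token, which is why *ohli finds kohli.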