Modify Input Dynamically to Search against data stored in Elasticsearch

Time:01-14

I am new to Elasticsearch and have been reading a lot about it, but I am stuck on one requirement.

Consider a field of type text named "app_data" that is present in every document of an index.

The app_data field always stores a single word, but that word can be alphanumeric, numeric, or alphabetic.

Requirement -

One type of word stored in app_data looks something like -

app_data:"99IPAB999999FG"

When a user wants to search for this value, they enter something like

"99.IPAB.99999.9"

Another example -

Data in index: app_data:"78IGDB900459JI"

User searches for: "78.IGDB.90045.9"

How should I form an ES query to match the data stored in the indexed documents, if this is feasible?

Considerations -

  1. I cannot transform the data at index time (with a custom analyzer), because app_data can also hold simple words like "RED" or "RED567".
  2. For the problem described above only, I think I have to use a custom analyzer together with the query DSL.

CodePudding user response:

Assuming your data is already indexed and cannot be changed, my suggestion is to normalize the term before sending it to the query by removing the "." characters: 99.IPAB.99999.9 -> 99ipab999999.

The normalized term is a prefix of the stored value, so a match_phrase_prefix query will match it.
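As a minimal sketch of that client-side normalization, assuming the application builds the search body in Python before sending it: the helper names normalize_term and build_query are hypothetical, and only the app_data field and the match_phrase_prefix query come from this thread. Lowercasing mirrors the example above; it is optional, since the field's standard analyzer lowercases the term at search time anyway.

```python
import json
import re

# Hypothetical helpers for illustration; not part of any Elasticsearch client.
_DOTS = re.compile(r"\.")


def normalize_term(user_input: str) -> str:
    """Remove every '.' from the user's input and lowercase the result."""
    return _DOTS.sub("", user_input).lower()


def build_query(user_input: str, field: str = "app_data") -> dict:
    """Build a match_phrase_prefix search body for the normalized term."""
    return {
        "query": {
            "match_phrase_prefix": {
                field: {"query": normalize_term(user_input)}
            }
        }
    }


if __name__ == "__main__":
    # Print the JSON body that would be sent to the _search endpoint.
    print(json.dumps(build_query("78.IGDB.90045.9"), indent=2))
```

The resulting dictionary can be serialized and posted to the index's _search endpoint with whatever HTTP or Elasticsearch client the application already uses.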

If you cannot normalize the input on the client side, you can do it at search time with an analyzer, either through the "analyzer" parameter of the query (used below) or through "search_analyzer" in the mapping.

The idea is to create an analyzer that produces the token without the ".". Add "analyzer": "my_analyzer" to your query so that the search term is tokenized without the "." and the match works.

New analyzer:

PUT my-index-000001
{
  "settings": {
    "analysis": {
      "char_filter": {
        "my_char_filter": {
          "type": "pattern_replace",
          "pattern": """\.""",
          "replacement": ""
        }
      },
      "analyzer": {
        "my_analyzer": {
          "tokenizer": "standard",
          "filter": [
            "lowercase"
          ],
          "char_filter": [
            "my_char_filter"
          ]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "app_data": {
        "type": "text",
        "analyzer": "standard"
      }
    }
  }
}

Sample data and query

POST my-index-000001/_bulk
{"index":{}}
{"app_data":"99IPAB999999FG"}
{"index":{}}
{"app_data":"78IGDB900459JI"}

POST my-index-000001/_search
{
  "from": 0,
  "size": 5,
  "query": {
   "match_phrase_prefix": {
     "app_data": {
       "query": "78.IGDB.90045.9",
       "analyzer": "my_analyzer"
     }
   }
  }
}

Hits

"hits": [
  {
    "_index": "my-index-000001",
    "_id": "_jNxfoUBQB-6H-4Z6KWM",
    "_score": 0.6931471,
    "_source": {
      "app_data": "78IGDB900459JI"
    }
  }
]
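To see why this hit comes back while plain values such as "RED" keep working, here is a rough Python imitation of the two analysis paths. It is not Elasticsearch itself: it assumes every app_data value is a single word, so that the standard analyzer produces exactly one lowercased token per value, and the function names are hypothetical.

```python
import re


def index_token(value: str) -> str:
    # Assumption: a single-word value becomes one lowercased token at index time.
    return value.lower()


def search_token(user_input: str) -> str:
    # Imitates my_char_filter (strip ".") followed by the lowercase filter.
    return re.sub(r"\.", "", user_input).lower()


def prefix_hits(user_input: str, stored_values: list[str]) -> list[str]:
    # match_phrase_prefix on a single term behaves like a prefix match here.
    prefix = search_token(user_input)
    return [v for v in stored_values if index_token(v).startswith(prefix)]


stored = ["99IPAB999999FG", "78IGDB900459JI", "RED", "RED567"]

if __name__ == "__main__":
    for term in ("78.IGDB.90045.9", "99.IPAB.99999.9", "RED"):
        print(term, "->", prefix_hits(term, stored))
```

Because only the search term is rewritten, the indexed values stay untouched, which is what the first consideration in the question asks for.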