How to write a DSL query that searches a query string with an asterisk after excluding predefined words

Time:07-28

My stop.txt contains messi.

The settings are below:

{
  "settings": {
    "index": {
      "analysis": {
        "filter": {
          "synonym_en": {
            "type": "synonym",
            "synonyms_path": "synom.txt"
          },
          "english_stop": {
            "type": "stop",
            "stopwords_path": "stop.txt"
          }
        },
        "analyzer": {
          "english_analyzer": {
            "tokenizer": "standard",
            "filter": ["english_stop", "synonym_en"]
          }
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "name": {
        "type": "text",
        "analyzer": "english_analyzer"
      }
    }
  }
}

My documents are below:

[
  { "id": 0, "name": "Messiis player" },
  { "id": 1, "name": "Messi player" },
  { "id": 2, "name": "Messi and Rono player" },
  { "id": 3, "name": "Rono and Messi player" },
  { "id": 4, "name": "messiis and Messi player" }
]

The DSL query is below:

{
  "query": {
    "bool": {
      "must": {
        "query_string": {
          "query": "messi*",
          "fields": ["name^128"]
        }
      }
    }
  }
}

My output is below; I am getting all the documents back:

{
  "took": 3,
  "timed_out": false,
  "_shards": { "total": 1, "successful": 1, "skipped": 0, "failed": 0 },
  "hits": {
    "total": { "value": 5, "relation": "eq" },
    "max_score": 128.0,
    "hits": [
      {
        "_index": "player",
        "_type": "_doc",
        "_id": "0",
        "_score": 128.0,
        "_source": { "id": 0, "name": "Messiis player" }
      },
      {
        "_index": "player",
        "_type": "_doc",
        "_id": "1",
        "_score": 128.0,
        "_source": { "id": 1, "name": "Messi player" }
      },
      {
        "_index": "player",
        "_type": "_doc",
        "_id": "2",
        "_score": 128.0,
        "_source": { "id": 2, "name": "Messi and Rono player" }
      },
      {
        "_index": "player",
        "_type": "_doc",
        "_id": "3",
        "_score": 128.0,
        "_source": { "id": 3, "name": "Rono and Messi player" }
      },
      {
        "_index": "player",
        "_type": "_doc",
        "_id": "4",
        "_score": 128.0,
        "_source": { "id": 4, "name": "messiis and Messi player" }
      }
    ]
  }
}
  • My query contains an asterisk (*).

  • If I search for "query": "messi*", I get {'id': 4, 'name': 'messiis and Messi player'} in the output.

  • If I search for "query": "messi*", I need the expected output shown below.

  • If I search for "query": "Messi*", I need the same expected output (basically, the search has to be case-insensitive).

  • I cannot figure out where the error occurs.

{
  "took": 8,
  "timed_out": false,
  "_shards": { "total": 1, "successful": 1, "skipped": 0, "failed": 0 },
  "hits": {
    "total": { "value": 2, "relation": "eq" },
    "max_score": 128.0,
    "hits": [
      {
        "_index": "player",
        "_type": "_doc",
        "_id": "0",
        "_score": 128.0,
        "_source": { "id": 0, "name": "Messiis player" }
      },
      {
        "_index": "player",
        "_type": "_doc",
        "_id": "4",
        "_score": 128.0,
        "_source": { "id": 4, "name": "messiis and Messi player" }
      }
    ]
  }
}

CodePudding user response:

The problem is that your stop.txt file probably contains messi in lowercase and your english_analyzer doesn't lowercase your tokens.

So you have two options:

A. Add Messi (capitalized) to your stop.txt file.

B. Add a lowercase token filter before the stop filter:

        "analyzer": {
          "english_analyzer": {
            "tokenizer": "standard",
            "filter": ["lowercase", "english_stop", "synonym_en"]
                            ^
                            |
                        add this
          }
        }
      

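Then it will work and remove all messi tokens, whatever their case. Because analysis happens at index time, documents indexed before the change need to be reindexed.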
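
To check what the analyzer actually emits, the _analyze API can be run against the index. This is a minimal sketch, assuming the index is named player (as in the search output in the question) and was recreated with the updated settings; the request body below would be sent to GET /player/_analyze:

```json
{
  "analyzer": "english_analyzer",
  "text": "Rono and Messi player"
}
```

If the lowercase filter runs before english_stop, the response should list no messi token for this text. The exact tokens also depend on the contents of synom.txt, which are not shown here.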

CodePudding user response:

You can try this:

{
  "query": {
    "bool": {
      "must": {
        "query_string": {
          "query": "messi",
          "default_field": "name",
          "default_operator": "OR"
        }
      }
    }
  }
}