Remove stop words or an exclusion list while retrieving data from Elasticsearch

Time:06-18

  • I need to remove stop words, plus some extra words I want to exclude, from search
  • If I search for "comedy" I should not get anything back, since it is in the exclusion list; I have added "comedy" to the stop words
  • Whatever word I have added to the stop words should not be returned when it matches, irrespective of
  • A normal search should still return the values present, for example if I search for "Jennifer"
  • Right now the stop words are not working for me: when I search for one I still get output, but I should not
  • I have to use query_string only in my DSL query

[
  { 'id': 0, 'Title': 'Live1', 'US Gross': 146083, 'Worldwide Gross': 146083, 'US DVD Sales': None, 'Production Budget': 8000000, 'Release Date': 'Jun 12 1998', 'MPAA Rating': 'R', 'Running Time min': None, 'Distributor': 'Gramercy', 'Source': None, 'Major Genre': 'Comedy', 'Creative Type': None, 'Director': 'T and I', 'Rotten Tomatoes Rating': None, 'IMDB Rating': 6.1, 'IMDB Votes': 1071 },
  { 'id': 1, 'Title': 'First Love, Last Rites', 'US Gross': 10876, 'Worldwide Gross': 10876, 'US DVD Sales': None, 'Production Budget': 300000, 'Release Date': 'Aug 07 1998', 'MPAA Rating': 'R', 'Running Time min': None, 'Distributor': 'Strand', 'Source': None, 'Major Genre': 'Drama', 'Creative Type': None, 'Director': 'Richard Jennifer', 'Rotten Tomatoes Rating': None, 'IMDB Rating': 6.9, 'IMDB Votes': 207 }
]

My settings are below:

settings =   {
 "settings": {
   "analysis": {
     "analyzer": {
       "blogs_analyzer": {
         "type": "standard",
         "stopwords": ["and", "is","comedy"]
       }
     }
   }
 }
}

My DSL query is below:

{
"query": {
    "bool": {
      "must": {
        "query_string": {
          "query": "and",
          "fields": ["Title^24",
            "Major Genre^8","Director^2" ]
        }
      }
    }
  }}

CodePudding user response:

It seems you have created blogs_analyzer but not assigned it to any field, so the fields fall back to the default standard analyzer, which removes no stop words.
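To see why the assignment matters, here is a rough pure-Python approximation of what the two analyzers do to a term. It is only a sketch: the real standard analyzer tokenizes with Unicode text segmentation rather than a regex, and `analyze` is a hypothetical helper, not an Elasticsearch API.

```python
import re

# Stop words copied from the question's blogs_analyzer definition.
BLOGS_STOPWORDS = {"and", "is", "comedy"}


def analyze(text, stopwords=frozenset()):
    """Roughly mimic a standard-type analyzer: split into word tokens,
    lowercase them, then drop any token found in `stopwords`."""
    tokens = [t.lower() for t in re.findall(r"\w+", text)]
    return [t for t in tokens if t not in stopwords]


# Default standard analyzer: no stop words, so "and" survives and can match.
print(analyze("T and I"))                     # ['t', 'and', 'i']
# blogs_analyzer: the listed stop words are dropped from the token stream.
print(analyze("T and I", BLOGS_STOPWORDS))    # ['t', 'i']
print(analyze("Comedy", BLOGS_STOPWORDS))     # []
print(analyze("Jennifer", BLOGS_STOPWORDS))   # ['jennifer']
```

When the analyzed query ends up with no tokens at all, as for "and" or "comedy" here, there is nothing left to match against the index.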

You need to create the index with the configuration below, which assigns blogs_analyzer to each searched field; a search for a stop word will then return no results:

{
  "settings": {
    "analysis": {
      "analyzer": {
        "blogs_analyzer": {
          "type": "standard",
          "stopwords": [
            "and",
            "is",
            "comedy"
          ]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "Title": {
        "type": "text",
        "analyzer": "blogs_analyzer"
      },
      "Director": {
        "type": "text",
        "analyzer": "blogs_analyzer"
      },
      "Major Genre": {
        "type": "text",
        "analyzer": "blogs_analyzer"
      }
    }
  }
}
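A quick way to catch this kind of mistake before creating the index is to compare the fields used in the query with the mapping. The sketch below is only an illustration: `fields_missing_analyzer` is a hypothetical helper, and it only handles top-level fields with an optional `^boost` suffix.

```python
def fields_missing_analyzer(index_body, query_fields, analyzer):
    """Return the query fields whose mapping does not use `analyzer`."""
    properties = index_body.get("mappings", {}).get("properties", {})
    missing = []
    for field in query_fields:
        name = field.split("^", 1)[0]  # drop a boost suffix such as "^24"
        if properties.get(name, {}).get("analyzer") != analyzer:
            missing.append(name)
    return missing


stopwords = ["and", "is", "comedy"]
analysis = {
    "analyzer": {"blogs_analyzer": {"type": "standard", "stopwords": stopwords}}
}

# The question's index body: analyzer defined, but no mappings at all.
question_body = {"settings": {"analysis": analysis}}

# The answer's index body: each searched field uses blogs_analyzer.
answer_body = {
    "settings": {"analysis": analysis},
    "mappings": {
        "properties": {
            name: {"type": "text", "analyzer": "blogs_analyzer"}
            for name in ("Title", "Director", "Major Genre")
        }
    },
}

query_fields = ["Title^24", "Major Genre^8", "Director^2"]
print(fields_missing_analyzer(question_body, query_fields, "blogs_analyzer"))
# ['Title', 'Major Genre', 'Director']
print(fields_missing_analyzer(answer_body, query_fields, "blogs_analyzer"))
# []
```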

With the above configuration in place, reindex your data and search with the query below:

{
  "query": {
    "query_string": {
      "fields": ["Title","Director","Major Genre"], 
      "query": "and"
    }
  }
}
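The whole sequence (create the index, reindex, search) might look like the following from Python. This is a sketch under assumptions: it expects a client object from the official `elasticsearch` package (8.x keyword-argument style), and the function name, the index name `movies_v2` and the URL in the usage note are made up for illustration.

```python
def recreate_and_search(es, index, index_body, docs, text):
    """Create `index` from `index_body`, index `docs`, then run a
    query_string search for `text` and return the matching titles.

    `es` is assumed to be an elasticsearch.Elasticsearch client.
    """
    # Create a fresh index so the analyzer applies from the first document.
    es.indices.create(
        index=index,
        settings=index_body["settings"],
        mappings=index_body["mappings"],
    )
    # Reindex: documents are analyzed with blogs_analyzer as they are written.
    for doc in docs:
        es.index(index=index, id=doc["id"], document=doc)
    es.indices.refresh(index=index)  # make the documents searchable now

    resp = es.search(
        index=index,
        query={
            "query_string": {
                "fields": ["Title", "Director", "Major Genre"],
                "query": text,
            }
        },
    )
    return [hit["_source"]["Title"] for hit in resp["hits"]["hits"]]
```

Assuming a local cluster, usage could be `es = Elasticsearch("http://localhost:9200")` followed by `recreate_and_search(es, "movies_v2", index_body, docs, "Jennifer")`, where `index_body` is the settings and mappings shown above and `docs` is the list of movie dictionaries.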