How to put multiple settings in elasticsearch

Time:06-30

My data is shown below.

stopwords.txt contains all the stopwords, and synonym.txt contains the line football,soccer.

abc = [
{'id':1, 'name': 'christiano ronaldo', 'description': '[email protected]', 'type': 'football'},
{'id':2, 'name': 'lionel messi', 'description': '[email protected]','type': 'soccer'},
{'id':3, 'name': 'sachin', 'description': 'was', 'type': 'cricket'}
]
  • I have two text files, stopwords.txt and synonym.txt
  • If I search for a stopword, documents that match only on that stopword should not be returned
  • I need to apply these settings to the name and description fields
resp = es.search(index="players", body={
    "query": {
        "query_string": {
            "fields": ["name^2", "description^2"],
            "query": "was football*"
        }
    }
})

My output:

{'took': 17,
 'timed_out': False,
 '_shards': {'total': 1, 'successful': 1, 'skipped': 0, 'failed': 0},
 'hits': {'total': {'value': 2, 'relation': 'eq'},
  'max_score': 2.345461,
  'hits': [{'_index': 'players',
    '_type': '_doc',
    '_id': '3',
    '_score': 2.345461,
    '_source': {'id': 3,
     'name': 'sachin',
     'description': 'was',
     'type': 'cricket'}},
   {'_index': 'players',
    '_type': '_doc',
    '_id': '1',
    '_score': 2.0,
    '_source': {'id': 1,
     'name': 'christiano ronaldo',
     'description': '[email protected]',
     'type': 'football'}}]}}

Expected output:

{'took': 2,
 'timed_out': False,
 '_shards': {'total': 1, 'successful': 1, 'skipped': 0, 'failed': 0},
 'hits': {'total': {'value': 2, 'relation': 'eq'},
  'max_score': 2.0,
  'hits': [{'_index': 'players',
    '_type': '_doc',
    '_id': '1',
    '_score': 2.0,
    '_source': {'id': 1,
     'name': 'christiano ronaldo',
     'description': '[email protected]',
     'type': 'football'}},
   {'_index': 'players',
    '_type': '_doc',
    '_id': '2',
    '_score': 2.0,
    '_source': {'id': 2,
     'name': 'lionel messi',
     'description': '[email protected]',
     'type': 'soccer'}}]}}

CodePudding user response:

You need to define both a synonym filter and a stopword filter. Instead of keeping the synonyms in a file, you can pass them directly in the index settings, which makes them much easier to change: close the index, update the synonym list, and open the index again.
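
The close, update, reopen sequence can be sketched with the Python client roughly as follows. This is only an illustration under assumptions: `es` is taken to be an `elasticsearch.Elasticsearch` client used with the same 7.x-style `body=` argument as in the question, the index is `players`, and the helper names `build_synonym_update` and `update_synonyms` are made up for this example.

```python
def build_synonym_update(filter_name, synonyms):
    """Return a settings body that redefines one synonym filter."""
    return {
        "analysis": {
            "filter": {
                filter_name: {
                    "type": "synonym_graph",
                    "synonyms": list(synonyms),
                }
            }
        }
    }


def update_synonyms(es, index, filter_name, synonyms):
    """Close the index, replace the synonym list, and reopen the index.

    `es` is assumed to be an elasticsearch.Elasticsearch client.
    Analysis settings can only be changed while the index is closed.
    """
    body = build_synonym_update(filter_name, synonyms)
    es.indices.close(index=index)
    try:
        es.indices.put_settings(index=index, body=body)
    finally:
        # Reopen even if the update fails, so the index is not left closed.
        es.indices.open(index=index)
    return body
```

A call such as `update_synonyms(es, "players", "synonym_en", ["football, soccer"])` would apply the list from the question. Documents indexed before the change keep the tokens produced by the old synonym list, so they may need to be reindexed when the filter is also used at index time.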

Also, if your content is in English, Elasticsearch has a built-in English stopword list (_english_) that already contains "was"; you can look up the full default list in the Elasticsearch stop token filter documentation.

Now create your index with the settings and mapping below:

{
    "settings": {
        "index": {
            "analysis": {
                "filter": {
                    "synonym_en": {
                        "type": "synonym_graph",
                        "synonyms": [
                            "football, soccer"
                        ]
                    },
                    "english_stop": {
                        "type": "stop",
                        "stopwords": "_english_"
                    }
                },
                "analyzer": {
                    "english_analyzer": {
                        "tokenizer": "standard",
                        "filter": [
                            "lowercase",
                            "english_stop",
                            "synonym_en"
                        ]
                    }
                }
            }
        }
    },
    "mappings": {
        "properties": {
            "name": {
                "type": "text",
                "analyzer": "english_analyzer"
            },
            "description": {
                "type": "text",
                "analyzer": "english_analyzer"
            }
        }
    }
}
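
From Python, the same body can be passed to the index-creation call before any documents are indexed, since the analyzer of an existing field cannot be changed afterwards. The sketch below assumes `es` is an `elasticsearch.Elasticsearch` client used with the 7.x-style `body=` argument from the question; the helper names `players_index_body` and `create_and_fill` are hypothetical.

```python
def players_index_body():
    """Settings and mappings from the answer above, as a Python dict."""
    return {
        "settings": {
            "index": {
                "analysis": {
                    "filter": {
                        "synonym_en": {
                            "type": "synonym_graph",
                            "synonyms": ["football, soccer"],
                        },
                        "english_stop": {
                            "type": "stop",
                            "stopwords": "_english_",
                        },
                    },
                    "analyzer": {
                        "english_analyzer": {
                            "tokenizer": "standard",
                            "filter": ["lowercase", "english_stop", "synonym_en"],
                        }
                    },
                }
            }
        },
        "mappings": {
            "properties": {
                "name": {"type": "text", "analyzer": "english_analyzer"},
                "description": {"type": "text", "analyzer": "english_analyzer"},
            }
        },
    }


def create_and_fill(es, index, docs):
    """Create the index with the custom analyzer, then index the documents."""
    es.indices.create(index=index, body=players_index_body())
    for doc in docs:
        es.index(index=index, id=doc["id"], body=doc)
    # Refresh so the new documents are visible to the next search.
    es.indices.refresh(index=index)
```

With the list from the question this would be called as `create_and_fill(es, "players", abc)`.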

After that, index your documents again and run the same query; you should now get the expected results.
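
If the results still look wrong, one way to check what the analyzer does is the analyze API. The helper below is a small sketch with a made-up name (`analyzed_tokens`); it assumes `es` is an `elasticsearch.Elasticsearch` client and that the index was created with the `english_analyzer` defined above.

```python
def analyzed_tokens(es, index, analyzer, text):
    """Return the tokens that an index's analyzer produces for `text`."""
    resp = es.indices.analyze(
        index=index,
        body={"analyzer": analyzer, "text": text},
    )
    return [token["token"] for token in resp["tokens"]]
```

For example, `analyzed_tokens(es, "players", "english_analyzer", "was football")` should return a list without `was` that contains both `football` and `soccer`, provided the stop and synonym filters are applied as intended.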

CodePudding user response:

You can use the index settings and mapping below. They define one stop filter per file through stopwords_path, chain both filters in a custom analyzer, and apply that analyzer to both the name and description fields.

{
    "settings": {
        "analysis": {
            "analyzer": {
                "stop-analyzer": {
                    "tokenizer": "whitespace",
                    "filter": [
                        "stop_words_1",
                        "stop_words_2"
                    ]
                }
            },
            "filter": {
                "stop_words_1": {
                    "type": "stop",
                    "stopwords_path": "stopwords.txt"
                },
                "stop_words_2": {
                    "type": "stop",
                    "stopwords_path": "synonym.txt"
                }
            }
        }
    },
    "mappings": {
        "properties": {
            "name": {
                "type": "text",
                "analyzer": "stop-analyzer"
            },
            "description": {
                "type": "text",
                "analyzer": "stop-analyzer"
            }
        }
    }
}
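
Because the question keeps stopwords and synonyms in two separate files, a variant worth considering loads stopwords.txt through a stop filter and synonym.txt through a synonym filter with `synonyms_path`. The sketch below only builds the request body; it assumes both files sit in the Elasticsearch config directory on every node, and the helper name `file_based_index_body` as well as the filter and analyzer names are invented for this example.

```python
def file_based_index_body(stopwords_file, synonyms_file, fields):
    """Build index settings that read stopwords and synonyms from files.

    Paths are resolved by Elasticsearch relative to its config directory.
    """
    analyzer_name = "stop_synonym_analyzer"  # hypothetical name
    return {
        "settings": {
            "analysis": {
                "filter": {
                    "file_stopwords": {
                        "type": "stop",
                        "stopwords_path": stopwords_file,
                    },
                    "file_synonyms": {
                        "type": "synonym",
                        "synonyms_path": synonyms_file,
                    },
                },
                "analyzer": {
                    analyzer_name: {
                        "tokenizer": "standard",
                        # Lowercase first so both files can be written in lowercase.
                        "filter": ["lowercase", "file_stopwords", "file_synonyms"],
                    }
                },
            }
        },
        "mappings": {
            "properties": {
                field: {"type": "text", "analyzer": analyzer_name}
                for field in fields
            }
        },
    }


body = file_based_index_body("stopwords.txt", "synonym.txt", ["name", "description"])
```

The resulting `body` can then be passed to the index-creation call in the same way as the JSON above.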