Prioritizing some keywords over others in Elasticsearch

Time:06-18

If I have an Elasticsearch query for patent data like:

"query": {
            "bool": {
                "must": [
                    {
                        "bool": {
                            "should": [
                                {"match": {"claim": "SOME KEYWORD IN THE CLAIMS"}},
                                {"match": {"claim": "SOME LESS IMPORTANT KEYWORD IN THE CLAIMS"}}]
                        }
                    },
                    {
                        "range": {
                            "date_published": {
                                "gte": start_year,
                                "lte": end_year
                            }
                        }

                    }
                ],
                "should": [],
                "filter": [{
                    "exists": {
                        "field": "title"}}]
            }
        },
        "size": str(size),
        "from": str(offset),

        'language': 'EN',
        "stemming": "true",
    }

How do I tell the search that one keyword is more important than another? The field I'm searching is a text field, so I don't think I can use "terms" and "weights" (maybe I can, I'm an Elasticsearch noob).

Edit: the new query, based on the suggestion, looks like:

{
    'query':
        {
            'bool':
                {
                    'must':
                        [{
                             'bool': {
                                 'should': []}},
                         {
                             'range': {
                                 'date_published': {
                                     'gte': '2019-01-01',
                                     'lte': None}}}],
                    'should': [{
                                   'match': {
                                       'title': 'AMBRA1',
                                       'boost': 2}}, {
                                   'match': {
                                       'abstract': 'AMBRA1',
                                       'boost': 2}}, {
                                   'match': {
                                       'claim': 'AMBRA1',
                                       'boost': 2}}, {
                                   'match': {
                                       'title': '',
                                       'boost': 2}}, {
                                   'match': {
                                       'abstract': '',
                                       'boost': 2}}, {
                                   'match': {
                                       'claim': '',
                                       'boost': 2}}],
                    'filter': [{
                                   'exists': {
                                       'field': 'title'}}]}},
    'size': '10',
    'from': '0',
    'language': 'EN',
    'stemming': 'true'}

and it returns a "cannot parse query" error.

CodePudding user response:

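You can use "boost" in your "match" clauses. A boosted clause contributes more to the relevance score than the others, so give the more important keyword a higher boost than the less important one. Note that "boost" has to go inside the field object, next to "query"; putting it beside the field name, as in your edited query, makes "match" look like it targets two fields, which is why the query cannot be parsed.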

 POST _search
   {
      "query": {
        "bool": {
          "must": [
            {
              "bool": {
                "should": []
              }
            },
            {
              "range": {
                "date_published": {
                  "gte": "2019-01-01"
                }
              }
            }
          ],
          "should": [
            {
              "match": {
                "title": {
                  "query": "AMBRA1",
                  "boost": 2
                }
              }
            },
            {
              "match": {
                "abstract": {
                  "query": "AMBRA1",
                  "boost": 2
                }
              }
            },
            {
              "match": {
                "claim": {
                  "query": "AMBRA1",
                  "boost": 2
                }
              }
            },
            {
              "match": {
                "title": {
                  "query": "",
                  "boost": 2
                }
              }
            },
            {
              "match": {
                "abstract": {
                  "query": "",
                  "boost": 2
                }
              }
            },
            {
              "match": {
                "claim": {
                  "query": "",
                  "boost": 2
                }
              }
            }
          ],
          "filter": [
            {
              "exists": {
                "field": "title"
              }
            }
          ]
        }
      },
      "size": "10",
      "from": "0"
    }
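
Since the question builds the query as a Python dict, the same idea can be sketched as a small helper that gives each keyword its own boost. This is only a minimal sketch under some assumptions: the function name `weighted_keyword_query`, its parameters, and the second keyword `"autophagy"` are hypothetical, the field names are taken from the question, and the date range is moved into `filter` because it does not need to influence the score. Empty keywords are skipped, and `minimum_should_match` is set so that at least one keyword clause has to match.

```python
def weighted_keyword_query(weighted_keywords,
                           fields=("title", "abstract", "claim"),
                           start_date=None, end_date=None,
                           size=10, offset=0):
    """Build a bool query in which each keyword carries its own boost.

    weighted_keywords maps keyword -> boost (a higher boost means more important).
    """
    # One match clause per (keyword, field) pair; empty keywords are skipped.
    should = [
        {"match": {field: {"query": keyword, "boost": boost}}}
        for keyword, boost in weighted_keywords.items()
        if keyword
        for field in fields
    ]

    # Only include the date bounds that were actually supplied.
    date_range = {}
    if start_date is not None:
        date_range["gte"] = start_date
    if end_date is not None:
        date_range["lte"] = end_date

    filters = [{"exists": {"field": "title"}}]
    if date_range:
        filters.append({"range": {"date_published": date_range}})

    return {
        "query": {
            "bool": {
                "should": should,
                # Require at least one keyword clause to match.
                "minimum_should_match": 1,
                "filter": filters,
            }
        },
        "size": size,
        "from": offset,
    }


# Hypothetical usage: "AMBRA1" matters three times as much as "autophagy".
body = weighted_keyword_query({"AMBRA1": 3, "autophagy": 1},
                              start_date="2019-01-01")
```

The resulting dict can be sent as the body of a `_search` request. The boost values themselves are arbitrary here; what matters is their ratio, and they usually need some tuning against real results.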