Search multiple fields and output summed score with Elasticsearch

Time:10-06

I have multiple fields, e.g. f1, f2, f3, and I want to search each one for its own term and return an aggregated score when any field matches. I do not want to search every field for the same terms, only each field for its own term, e.g. f1:t1, f2:t2, f3:t3.

Originally, I was using a bool query with a must clause containing a multi_match, with the terms concatenated as t1 t2 t3 and all fields searched, but the results aren't great. A dis_max query gives better results because I can search each field for its own term, but if, for example, t1 is found in f1 AND t2 in f2, dis_max gives back only the highest score among the matching clauses. So if I have 3 documents, { "f1": "foo", "f2": "foo" }, { "f1": "foo", "f2": "bar" }, { "f1": "foo", "f2": "baz" }, and I search for f1:foo and f2:ba, I can still get back the first record, the one with f2 of foo, when it was the most recently created. What I'm trying to say is: f1 matched foo, so there's a score for that, and f2 matched bar, so the resulting score should be f1.score + f2.score, which always brings that document to the top because it matches both.

I found that I could programmatically build a query that uses query_string, e.g. (limited to two fields for brevity):

GET /_search
{
  "query": {
    "query_string": {
      "query": "(f1:foo OR f1.autocomplete:foo) OR (f2:ba OR f2.autocomplete:ba)"
    }
  }
}

but I need to boost individual fields, and as far as I can tell this doesn't allow for that. I could also use a dis_max with a set of queries, but I'm really not sure how to aggregate the scores in that case.

Put another way: if I have people data and I want to search by first name and last name, without matching the first-name term against last names or the last-name term against first names, a result that matches both first and last name should rank higher than one that matches only one of them.

Is there a better or more proper way to achieve this? I feel like I've been over a lot of the query API and haven't found the best fit.

CodePudding user response:

You can use a simple bool query with should clauses:

"bool": {
  "minimum_should_match": 1,
  "should": [
    { "term": { "f1": "foo" } },
    { "term": { "f2": "ba" } }
  ]
}

The more clauses a document matches, the higher its score.
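To illustrate why this differs from dis_max, here is a minimal Python sketch that compares the two combination rules on made-up per-clause scores. It assumes that a bool query adds up the scores of its matching should clauses, and that dis_max takes the best clause score plus tie_breaker times the remaining ones. The function names and the numbers are hypothetical, not Elasticsearch output.

```python
def bool_should_score(clause_scores):
    """Combine like bool/should: add the scores of the clauses that matched.

    A score of None means the clause did not match the document.
    """
    return sum(s for s in clause_scores if s is not None)


def dis_max_score(clause_scores, tie_breaker=0.0):
    """Combine like dis_max: best score plus tie_breaker * the other scores."""
    matched = [s for s in clause_scores if s is not None]
    if not matched:
        return 0.0
    best = max(matched)
    return best + tie_breaker * (sum(matched) - best)


# Hypothetical scores for the (f1 clause, f2 clause) of two documents.
only_f1 = [1.5, None]   # matches the f1 term only
both = [1.5, 1.0]       # matches the f1 term and the f2 term

print(dis_max_score(only_f1), dis_max_score(both))          # tie under dis_max
print(bool_should_score(only_f1), bool_should_score(both))  # "both" ranks higher
```

Under these assumptions, dis_max with tie_breaker set to 1.0 gives the same total as the summed should clauses, so the bool form is mostly the simpler way to write the same intent.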

CodePudding user response:

I'm unable to edit the other answer, so I'm posting the solution derived from it here.

GET _search
{
  "query": {
    "bool": {
      "minimum_should_match": 1,
      "should": [
        {
          "match": {
            "f1": {
              "query": "foo",
              "boost": 1.5
            }
          }
        },
        {
          "match": {
            "f1.autocomplete": {
              "query": "foo",
              "boost": 1.5
            }
          }
        },
        {
          "match": {
            "f2": {
              "query": "ba",
              "boost": 1
            }
          }
        },
        {
          "match": {
            "f2.autocomplete": {
              "query": "ba",
              "boost": 1
            }
          }
        }
      ]
    }
  }
}

This gets me results that meet all of my criteria.
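Since the question mentions building the query programmatically, a query of this shape could be generated with a small helper like the sketch below. The helper name build_per_field_query and its parameters are hypothetical, and it assumes every searched field has an autocomplete subfield, as in the example above.

```python
import json


def build_per_field_query(field_terms, subfields=("autocomplete",)):
    """Build a bool/should query that searches each field for its own term.

    field_terms maps a field name to a (term, boost) pair. Each field and
    each of its subfields gets its own match clause with that boost.
    """
    should = []
    for field, (term, boost) in field_terms.items():
        names = [field] + [f"{field}.{sub}" for sub in subfields]
        for name in names:
            should.append({"match": {name: {"query": term, "boost": boost}}})
    return {"query": {"bool": {"minimum_should_match": 1, "should": should}}}


# Same fields, terms, and boosts as the hand-written query.
body = build_per_field_query({"f1": ("foo", 1.5), "f2": ("ba", 1)})
print(json.dumps(body, indent=2))
```

The resulting dictionary can be serialized and sent as the body of a GET _search request with whatever client is already in use.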
