Elasticsearch partial matching on postal address and customer number

Time:12-21

I am trying to partially match a search term against the schema below for autocomplete. I'd like customerNumber, addressLine1 and zip to match any document where one of those fields starts with the search term (so 419 should match customer number 41915678, address 4191 Board Street and zip code 41912).

"mappings": {
    "companyName": {
        "type": "text"
    },
    "customerNumber": {
        "type": "long"
    },
    "address": {
        "addressLine1": {
            "type": "text"
        },
        "city": {
            "type": "text"
        },
        "state": {
            "type": "text"
        },
        "zip": {
            "type": "text"
        }
    }
}

Anyone got a neat solution to the query? Eventually I would need to convert this query to C# using NEST client.

CodePudding user response:

One easy way to do it is to leverage the completion suggester field type.

Basically, you can modify your mapping by adding a completion field, such as:

  "suggest": {
    "type": "completion"
  },

However, the completion field's default analyzer (i.e. the simple analyzer) splits on non-letter characters and therefore drops digits, so we need to create a custom analyzer that keeps them:

PUT my-index
{
  "settings": {
    "analysis": {
      "analyzer": {
        "suggest_analyzer": {         <--- custom analyzer
          "type": "custom",
          "tokenizer": "classic",
          "filter": [
            "lowercase"
          ]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      ...,
      "suggest": {                    <--- the new completion field with the right analyzer
        "type": "completion",
        "analyzer": "suggest_analyzer"
      }
    }
  }
}
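
To see why the default analyzer is a problem here, the difference can be approximated outside Elasticsearch. The sketch below is only a rough Python stand-in, not the real analyzers: `simple_like` and `digit_keeping` are hypothetical helpers that assume ASCII input and mimic, respectively, a letters-only tokenizer and one that also keeps digits.

```python
import re


def simple_like(text):
    """Rough stand-in for the simple analyzer: letter runs only, lowercased."""
    return [t.lower() for t in re.findall(r"[A-Za-z]+", text)]


def digit_keeping(text):
    """Rough stand-in for a tokenizer plus lowercase filter that keeps digits."""
    return [t.lower() for t in re.findall(r"[A-Za-z0-9]+", text)]


sample = "4191 Board Street"
print(simple_like(sample))    # the house number is gone
print(digit_keeping(sample))  # the house number survives as a token
```

With the letters-only variant, a purely numeric value such as a zip code produces no tokens at all, so there is nothing left for a prefix like 419 to match.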

Then you simply need to populate your index, adding every value you want suggestions on to the suggest field, like below:

PUT my-index/_doc/1
{
  "address": {
    "addressLine1": "1234 Main Street",
    "zip": "34526"
  },
  "customerNumber": "41915678",
  "suggest": [
    "1234 Main Street",
    "34526",
    "41915678"
  ]
}
PUT my-index/_doc/2
{
  "address": {
    "addressLine1": "4191 Board Street",
    "zip": "45263"
  },
  "customerNumber": "45267742",
  "suggest": [
    "4191 Board Street",
    "45263",
    "45267742"
  ]
}
PUT my-index/_doc/3
{
  "address": {
    "addressLine1": "5662 4th Avenue",
    "zip": "41912"
  },
  "customerNumber": "24442561",
  "suggest": [
    "5662 4th Avenue",
    "41912",
    "24442561"
  ]
}
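
When indexing many documents, the suggest array can be derived from the source fields rather than typed by hand. A minimal Python sketch, assuming the field names from the mapping in the question; `with_suggest` is a hypothetical helper, not part of any Elasticsearch client:

```python
import json


def with_suggest(doc):
    """Return a copy of doc whose 'suggest' field lists the values to autocomplete on."""
    address = doc.get("address", {})
    candidates = [
        address.get("addressLine1"),
        address.get("zip"),
        doc.get("customerNumber"),
    ]
    # Skip missing values and store everything as strings, as in the examples above.
    inputs = [str(value) for value in candidates if value is not None]
    return {**doc, "suggest": inputs}


doc = {
    "address": {"addressLine1": "4191 Board Street", "zip": "45263"},
    "customerNumber": 45267742,
}
print(json.dumps(with_suggest(doc), indent=2))
```

The returned dictionary can then be serialized and sent as the body of the document PUT request; the original dictionary is left unchanged.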

Then, you can search for 419 using the following suggest query:

POST my-index/_search
{
  "suggest": {
    "customer-suggest": {
      "prefix": "419",
      "completion": {
        "field": "suggest"
      }
    }
  }
}

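And you will get all three documents, because each one has a value in its suggest field that starts with 419 (the customer number in document 1, the address in document 2 and the zip code in document 3).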
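
The expected result can be sanity-checked without a cluster. The sketch below is a simplification, under the assumption that a completion match boils down to a case-insensitive prefix test on each suggest input; `matching_ids` is a hypothetical helper and does not reproduce the suggester's analysis or scoring.

```python
SUGGEST_INPUTS = {
    1: ["1234 Main Street", "34526", "41915678"],
    2: ["4191 Board Street", "45263", "45267742"],
    3: ["5662 4th Avenue", "41912", "24442561"],
}


def matching_ids(prefix, inputs_by_id=SUGGEST_INPUTS):
    """Return the ids of documents with at least one input starting with prefix."""
    needle = prefix.lower()
    return sorted(
        doc_id
        for doc_id, inputs in inputs_by_id.items()
        if any(value.lower().startswith(needle) for value in inputs)
    )


print(matching_ids("419"))    # [1, 2, 3]
print(matching_ids("41915"))  # [1]
```

Note that a term such as "main" matches nothing in this model, because the test is anchored at the start of each input rather than at every word.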
