I am matching phone numbers and SSNs that start with '40'. For SSNs I get the correct match count. For phone numbers the count is wrong, because the numbers contain hyphens ('-'). Example: '403-517-2323'.
When I search for phone numbers that start with '40', the results also include numbers where '40' begins a later group after a hyphen. Examples: '222-401-8120' and '823-093-4012'.
How can I exclude these matches in the middle and match only at the start of the phone number?
Below is the query I am trying:
GET emp_details_1_1/_msearch
{"index": "emp_details_1_1"}
{"_source":[],"size":0,"min_score":1,"query":{"multi_match":{"query":"40","fields":["ssn"],"type":"phrase_prefix"}}}
{"index": "emp_details_1_1"}
{"_source":[],"size":0,"min_score":1,"query":{"multi_match":{"query":"21","fields":["phone"],"type":"phrase_prefix"}}}
CodePudding user response:
As I don't have the exact index mapping and settings, I am guessing you are using an analyzer that splits your phone number on '-'. If it is the default analyzer (standard), it does exactly that:
POST _analyze
{
  "text": "403-517-2323",
  "analyzer": "standard"
}
Tokens generated:
{
  "tokens": [
    {
      "token": "403",
      "start_offset": 0,
      "end_offset": 3,
      "type": "<NUM>",
      "position": 0
    },
    {
      "token": "517",
      "start_offset": 4,
      "end_offset": 7,
      "type": "<NUM>",
      "position": 1
    },
    {
      "token": "2323",
      "start_offset": 8,
      "end_offset": 12,
      "type": "<NUM>",
      "position": 2
    }
  ]
}
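Each group of digits becomes its own token, so a phrase_prefix search for '40' also matches tokens such as '401' and '4012' from the middle or end of a phone number.
To fix that, use a prefix query on the .keyword subfield (if it is generated in your mapping), or create a field of type keyword that stores the phone number in your index. A keyword field is not analyzed, so the prefix is compared against the whole value, hyphens included, and only matches at the start.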
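As a rough sketch of that fix, the requests below assume the phone field is a text field and that its keyword subfield is called phone.keyword (the name dynamic mapping would normally produce; check your own mapping, as I can't see it). The first request is only needed if the subfield does not exist yet, and documents indexed before the mapping change would have to be reindexed before they show up under the new subfield.

```
# Only if phone.keyword is missing: add a keyword subfield (assumed name)
PUT emp_details_1_1/_mapping
{
  "properties": {
    "phone": {
      "type": "text",
      "fields": {
        "keyword": { "type": "keyword" }
      }
    }
  }
}

# Count phone numbers whose whole, unanalyzed value starts with "40"
GET emp_details_1_1/_search
{
  "size": 0,
  "track_total_hits": true,
  "query": {
    "prefix": {
      "phone.keyword": { "value": "40" }
    }
  }
}
```

With this query '403-517-2323' should match, while '222-401-8120' and '823-093-4012' should not, because the stored keyword values start with '222' and '823'. The same query body can be dropped into the phone line of your _msearch request in place of the multi_match.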