I have a bunch of documents that look like this in my index:
{
  "given_name":"John",
  "family_name":"Smith",
  "email_addresses": [
    {
      "email_address":"[email protected]",
      "primary":true
    },
    {
      "email_address":"[email protected]",
      "primary":false
    },
    {
      "email_address":"[email protected]",
      "primary":false
    },
    {
      "email_address":"[email protected]",
      "primary":false
    }
  ]
}
The mapping looks like this:
{
  "mappings":{
    "properties":{
      "given_name":{
        "type":"keyword",
        "fields":{
          "search":{
            "type":"search_as_you_type"
          }
        }
      },
      "family_name":{
        "type":"keyword",
        "fields":{
          "search":{
            "type":"search_as_you_type"
          }
        }
      },
      "email_addresses":{
        "type":"nested",
        "properties":{
          "email_address":{
            "type":"keyword",
            "fields":{
              "search":{
                "type":"search_as_you_type"
              }
            }
          },
          "primary":{
            "type":"boolean"
          }
        }
      }
    }
  }
}
I am running a prefix search on given_name, family_name and email_addresses. This lets the user start typing and get relevant results from those fields as they type:
{
  "query":{
    "bool":{
      "should":[
        {
          "nested":{
            "path":"email_addresses",
            "query":{
              "prefix":{
                "email_addresses.email_address.search": {
                  "value":"j"
                }
              }
            }
          }
        },
        {
          "multi_match":{
            "query":"j",
            "fields":[
              "given_name.search",
              "family_name.search"
            ],
            "type": "bool_prefix"
          }
        }
      ]
    }
  }
}
I'd like to sort the results of the above query by the best-matching email_address in email_addresses when one or more of them match; otherwise the sort should use the email_address in email_addresses whose primary is true.
I looked into a sort script, but I couldn't find any way in the documentation to access the matched nested child from a script.
Is there any way to achieve this?
CodePudding user response:
To do this, we can use a bool query in the nested sort.
Given we have the following 4 documents:
{
  "given_name":"John",
  "family_name":"Smith1",
  "email_addresses": [
    {
      "email_address":"[email protected]",
      "primary":true
    },
    {
      "email_address":"[email protected]",
      "primary":false
    },
    {
      "email_address":"[email protected]",
      "primary":false
    },
    {
      "email_address":"someguy53gmail.com",
      "primary":false
    }
  ]
}
{
  "given_name":"John",
  "family_name":"Smith2",
  "email_addresses": [
    {
      "email_address":"[email protected]",
      "primary":true
    },
    {
      "email_address":"[email protected]",
      "primary":false
    },
    {
      "email_address":"[email protected]",
      "primary":false
    },
    {
      "email_address":"someguy56gmail.com",
      "primary":false
    }
  ]
}
{
  "given_name":"John",
  "family_name":"Smith3",
  "email_addresses": [
    {
      "email_address":"[email protected]",
      "primary":true
    },
    {
      "email_address":"[email protected]",
      "primary":false
    },
    {
      "email_address":"[email protected]",
      "primary":false
    },
    {
      "email_address":"someguy46gmail.com",
      "primary":false
    }
  ]
}
{
  "given_name":"John",
  "family_name":"Smith4",
  "email_addresses": [
    {
      "email_address":"[email protected]",
      "primary":true
    },
    {
      "email_address":"[email protected]",
      "primary":false
    },
    {
      "email_address":"[email protected]",
      "primary":false
    },
    {
      "email_address":"someguy42gmail.com",
      "primary":false
    }
  ]
}
We can write our query like so:
{
  "query":{
    "bool":{
      "should":[
        {
          "nested":{
            "path":"email_addresses",
            "query":{
              "prefix":{
                "email_addresses.email_address.search":{
                  "value":"john"
                }
              }
            }
          }
        },
        {
          "multi_match":{
            "query":"john",
            "fields":[
              "given_name.search",
              "family_name.search"
            ],
            "type":"bool_prefix"
          }
        }
      ]
    }
  },
  "sort":[
    {
      "email_addresses.email_address":{
        "order" : "asc",
        "nested":{
          "path":"email_addresses",
          "filter":{
            "bool":{
              "should":[
                {
                  "prefix":{
                    "email_addresses.email_address.search":{
                      "value":"john"
                    }
                  }
                },
                {
                  "term":{
                    "email_addresses.primary": true
                  }
                }
              ]
            }
          }
        }
      }
    }
  ]
}
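The request above repeats the typed text in three places (the nested prefix query, the multi_match, and the sort filter), so in application code it may be easier to build the body from a single value. The following is a minimal sketch, assuming the mapping shown in the question; the function name build_search_body is hypothetical and not part of any Elasticsearch client.

```python
import json


def build_search_body(text: str) -> dict:
    """Build the search request body for the text the user has typed so far."""
    # The same prefix clause is used to match documents and to filter
    # which nested entries are eligible as sort values.
    prefix_clause = {
        "prefix": {"email_addresses.email_address.search": {"value": text}}
    }
    return {
        "query": {
            "bool": {
                "should": [
                    {"nested": {"path": "email_addresses", "query": prefix_clause}},
                    {
                        "multi_match": {
                            "query": text,
                            "fields": ["given_name.search", "family_name.search"],
                            "type": "bool_prefix",
                        }
                    },
                ]
            }
        },
        "sort": [
            {
                "email_addresses.email_address": {
                    "order": "asc",
                    "nested": {
                        "path": "email_addresses",
                        "filter": {
                            "bool": {
                                "should": [
                                    prefix_clause,
                                    {"term": {"email_addresses.primary": True}},
                                ]
                            }
                        },
                    },
                }
            }
        ],
    }


if __name__ == "__main__":
    # Print the JSON that would be sent as the request body.
    print(json.dumps(build_search_body("john"), indent=2))
```

The resulting dictionary serializes to the same JSON as the request above when called with "john", and can be passed as the body of a search request by whichever client is in use.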
First we do a prefix search on email_addresses.email_address, given_name and family_name.
Then we sort on the nested email_addresses field as follows:
- Sort by the email_addresses.email_address that matches our query.
- Sort by the email_addresses.email_address whose email_addresses.primary is true.
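The way this works is that the filter in the nested sort decides which entries under email_addresses may supply the sort value. With the two clauses under should, an entry qualifies if its address matches the prefix or if its primary flag is true. Because the order is asc, the smallest qualifying email_address of each document is used (the default sort mode for ascending order is min). Both clauses apply together rather than one after the other, so a primary address can be picked even when another address in the same document matches the prefix. Documents in which no entry passes the filter have no sort value and are placed last by default.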
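To see which address each document ends up being sorted by, the nested sort can be approximated in plain Python. This is a rough sketch under stated assumptions, not Elasticsearch itself: the prefix match is simplified to a case-insensitive startswith on the whole address, the filter is assumed to select entries matching either clause, asc is assumed to use the default min mode, and documents without a value are assumed to go last. The sample addresses and the helper names sort_value and sort_docs are made up for illustration.

```python
from typing import List, Optional


def sort_value(doc: dict, prefix: str) -> Optional[str]:
    """Approximate the value a document is sorted by."""
    candidates = [
        entry["email_address"]
        for entry in doc["email_addresses"]
        # bool/should in the nested sort filter: an entry qualifies if it
        # matches the prefix OR is flagged as primary.
        if entry["email_address"].lower().startswith(prefix.lower())
        or entry["primary"]
    ]
    # "order": "asc" is assumed to pick the smallest qualifying value.
    return min(candidates) if candidates else None


def sort_docs(docs: List[dict], prefix: str) -> List[dict]:
    """Order documents ascending by sort value; documents without one go last."""
    def key(doc: dict):
        value = sort_value(doc, prefix)
        return (value is None, value or "")

    return sorted(docs, key=key)


# Made-up sample data (not the documents from the thread).
docs = [
    {"family_name": "Smith1", "email_addresses": [
        {"email_address": "zed@example.com", "primary": True},
        {"email_address": "john.b@example.com", "primary": False},
    ]},
    {"family_name": "Smith2", "email_addresses": [
        {"email_address": "adam@example.com", "primary": True},
        {"email_address": "john.a@example.com", "primary": False},
    ]},
    {"family_name": "Smith3", "email_addresses": [
        {"email_address": "mary@example.com", "primary": True},
    ]},
]

if __name__ == "__main__":
    for doc in sort_docs(docs, "john"):
        print(doc["family_name"], sort_value(doc, "john"))
```

In this sample, Smith2 is sorted by its primary address even though it also has an address starting with "john", because the primary address is alphabetically smaller. If a matching address must always take precedence over the primary one, that is worth testing against real data before relying on this sort.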