I would like to order Elasticsearch query results by the percentage of entries in a nested field that match.
For example, suppose I have an Elasticsearch index with the following mapping:
{
  "properties": {
    "name": {
      "type": "text"
    },
    "jobs": {
      "type": "nested",
      "properties": {
        "id": {
          "type": "long"
        }
      }
    }
  }
}
With the following documents:
{
  "name": "Alice",
  "jobs": [
    { "id": 1 },
    { "id": 2 },
    { "id": 3 },
    { "id": 4 }
  ]
}
{
  "name": "Bob",
  "jobs": [
    { "id": 1 },
    { "id": 2 },
    { "id": 3 }
  ]
}
{
  "name": "Charles",
  "jobs": [
    { "id": 2 },
    { "id": 3 }
  ]
}
Now, I would like to perform a query to find which documents have specific jobs, ordered by the percentage of matched jobs. For example:
- Searching for jobs 1 and 2, I would expect the order to be:
  - Bob (66% jobs matched)
  - Alice (50% jobs matched)
  - Charles (50% jobs matched)
- Searching for job 2, I would expect the order to be:
  - Charles (50% jobs matched)
  - Bob (33% jobs matched)
  - Alice (25% jobs matched)
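To make the expected ordering concrete, here is a small Python sketch of the ranking I am after. It runs against a hypothetical in-memory copy of the three documents, not against Elasticsearch, and the name rank_by_match_ratio is made up for illustration.

```python
# Hypothetical in-memory copy of the three example documents (name -> job ids).
docs = {
    "Alice": [1, 2, 3, 4],
    "Bob": [1, 2, 3],
    "Charles": [2, 3],
}


def rank_by_match_ratio(docs, wanted):
    """Return (name, ratio) pairs sorted by matched jobs / total jobs, descending."""
    wanted = set(wanted)
    ratios = [
        (name, len(wanted.intersection(jobs)) / len(jobs))
        for name, jobs in docs.items()
        if wanted.intersection(jobs)  # skip documents with no matching job at all
    ]
    # sorted() is stable, so documents with equal ratios keep their input order.
    return sorted(ratios, key=lambda pair: pair[1], reverse=True)


for name, ratio in rank_by_match_ratio(docs, [1, 2]):
    print(name, round(ratio, 2))
```

The ratio is the number of requested job ids found in the document divided by the document's total number of jobs.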
So far, I'm using the following query, but it sorts by number of matches, not the percentage:
{
  "query": {
    "nested": {
      "path": "jobs",
      "query": {
        "bool": {
          "should": [
            {
              "match": {
                "jobs.id": "1"
              }
            },
            {
              "match": {
                "jobs.id": "2"
              }
            }
          ]
        }
      },
      "score_mode": "sum"
    }
  }
}
CodePudding user response:
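script_score seems to do the job: it divides the summed nested score (the number of matching jobs, assuming each match scores 1) by the document's total number of jobs. I set "boost_mode" to "replace" because function_score otherwise multiplies the script result by the original query score by default, which would skew the ratio:
{
  "query": {
    "function_score": {
      "query": {
        "nested": {
          "path": "jobs",
          "query": {
            "bool": {
              "should": [
                {
                  "match": {
                    "jobs.id": "1"
                  }
                },
                {
                  "match": {
                    "jobs.id": "2"
                  }
                }
              ]
            }
          },
          "score_mode": "sum"
        }
      },
      "script_score": {
        "script": {
          "source": "_score / params['_source']['jobs'].length"
        }
      },
      "boost_mode": "replace"
    }
  }
}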
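For what it's worth, function_score combines the query's score with the function's value according to boost_mode, which defaults to multiply. Below is a rough Python sketch of that arithmetic, not of Elasticsearch itself. It assumes each matching job contributes exactly 1.0 to the nested sum, and final_score is a made-up name for illustration.

```python
def final_score(matched, total, boost_mode):
    """Approximate the function_score result for one document.

    matched: number of nested jobs that matched the query
    total: total number of jobs in the document
    boost_mode: "multiply" (the default) or "replace"
    """
    # Assumption: score_mode "sum" with 1.0 per matching nested job.
    query_score = float(matched)
    # Value returned by the script: _score / number of jobs.
    script_value = query_score / total
    if boost_mode == "multiply":
        return query_score * script_value
    if boost_mode == "replace":
        return script_value
    raise ValueError("unsupported boost_mode: " + boost_mode)


# Searching for jobs 1 and 2: Alice matches 2 of 4, Charles matches 1 of 2.
print(final_score(2, 4, "replace"), final_score(1, 2, "replace"))
print(final_score(2, 4, "multiply"), final_score(1, 2, "multiply"))
```

Under replace, Alice and Charles both end up at 0.5, which is the percentage ordering asked for. Under multiply, Alice gets 1.0 and Charles 0.5, so documents with more matches are favoured again.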