I have an ES index with a property that contains a list of nested objects of the shape {"name": "string"}.
I need to query documents that have at least a certain number of objects matching a given list of names.
But setting minimum_should_match to a value greater than 1 returns no documents, contrary to what I expected.
Reproduction:
- Create the test index
PUT /test
{
  "mappings": {
    "properties": {
      "skills": {
        "properties": {
          "name": {
            "type": "keyword"
          }
        },
        "type": "nested"
      }
    }
  }
}
- Add a few documents
POST /test/_bulk
{ "index" : { "_index" : "test" } }
{ "skills" : [{"name": "python"}, {"name": "css"}, {"name": "java"}] }
{ "index" : { "_index" : "test" } }
{ "skills" : [{"name": "python"}] }
{ "index" : { "_index" : "test" } }
{ "skills" : [{"name": "python"}, {"name": "css"}, {"name": "html"}] }
{ "index" : { "_index" : "test" } }
{ "skills" : [{"name": "python"}, {"name": "css"}, {"name": "java"}, {"name": "photoshop"}, {"name": "js"}] }
{ "index" : { "_index" : "test" } }
{ "skills" : [{"name": "python"}, {"name": "git"}] }
{ "index" : { "_index" : "test" } }
{ "skills" : [{"name": "python"}, {"name": "css"}, {"name": "java"}, {"name": "react"}] }
I would like to return documents that have at least two of the skills ["python", "css", "java"].
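To make the expected result concrete, here is a minimal Python sketch (plain Python, not an Elasticsearch call) that counts how many of the wanted names each of the six sample documents contains. The names SAMPLE_DOCS, WANTED, MINIMUM and count_wanted are made up for this illustration.

```python
# The six sample documents from the bulk request, reduced to their skill names.
SAMPLE_DOCS = [
    ["python", "css", "java"],
    ["python"],
    ["python", "css", "html"],
    ["python", "css", "java", "photoshop", "js"],
    ["python", "git"],
    ["python", "css", "java", "react"],
]

WANTED = {"python", "css", "java"}  # names to look for
MINIMUM = 2                         # at least this many must be present


def count_wanted(skills, wanted=WANTED):
    """Return how many distinct wanted names appear in one document's skills."""
    return len(set(skills) & set(wanted))


# 0-based positions (in bulk order) of the documents that should come back.
expected = [i for i, skills in enumerate(SAMPLE_DOCS) if count_wanted(skills) >= MINIMUM]
print(expected)  # prints [0, 2, 3, 5]
```

So four of the six documents should be returned.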
The query below returns no documents.
{
  "highlight": {
    "fields": {
      "*": {}
    }
  },
  "query": {
    "nested": {
      "path": "skills",
      "query": {
        "bool": {
          "minimum_should_match": 2,
          "should": [
            {
              "match": {
                "skills.name": "python"
              }
            },
            {
              "match": {
                "skills.name": "css"
              }
            },
            {
              "match": {
                "skills.name": "java"
              }
            }
          ]
        }
      }
    }
  }
}
The same query with "minimum_should_match": 1 returns all 6 documents, as expected. As you can see in the highlights, more than a single skill is matched.
POST /test/_search | jq ".hits.hits[].highlight"
{
"skills.name": [ <=========== id of this document is aRN7uoUBoT40NEdPkl68
"<em>python</em>",
"<em>css</em>",
"<em>java</em>"
]
}
{
"skills.name": [
"<em>python</em>",
"<em>css</em>",
"<em>java</em>"
]
}
{
"skills.name": [
"<em>python</em>",
"<em>css</em>",
"<em>java</em>"
]
}
{
"skills.name": [
"<em>python</em>",
"<em>css</em>"
]
}
{
"skills.name": [
"<em>python</em>"
]
}
{
"skills.name": [
"<em>python</em>"
]
}
If I ask Elasticsearch to explain why the first document does not match the query with minimum_should_match=2, this is the output:
{
"_id": "aRN7uoUBoT40NEdPkl68",
"_index": "test",
"_type": "_doc",
"explanation": {
"description": "Not a match",
"details": [],
"value": 0.0
},
"matched": false
}
The behavior is the same if I switch match to term.
CodePudding user response:
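I get 4 results with minimum_should_match set to 2. I had to change the query so that each skill name is matched in its own nested clause. A nested query evaluates each nested object separately, and a single skills object holds only one name, so one object can never satisfy two should clauses at once.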
{
  "highlight": {
    "fields": {
      "*": {}
    }
  },
  "query": {
    "bool": {
      "minimum_should_match": 2,
      "should": [
        {
          "nested": {
            "path": "skills",
            "query": {
              "match": {
                "skills.name": "python"
              }
            }
          }
        },
        {
          "nested": {
            "path": "skills",
            "query": {
              "match": {
                "skills.name": "css"
              }
            }
          }
        },
        {
          "nested": {
            "path": "skills",
            "query": {
              "match": {
                "skills.name": "java"
              }
            }
          }
        }
      ]
    }
  }
}
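If the list of names or the threshold varies, the query above can be generated instead of written by hand. The following is only a sketch in plain Python: it builds the request body as a dict and does not talk to Elasticsearch. The function names clauses_matched_per_object and build_min_skills_query are hypothetical helpers invented for this example, and the highlight section is left out for brevity. The first helper is a rough model of why the original query finds nothing: it counts, per nested object, how many should clauses that one object can satisfy.

```python
import json


def clauses_matched_per_object(skills, names):
    """Plain-Python model (not an Elasticsearch call): for each nested object,
    count how many of the should clauses that single object satisfies."""
    return [sum(obj["name"] == name for name in names) for obj in skills]


def build_min_skills_query(names, minimum, path="skills", field="skills.name"):
    """Hypothetical helper: build a search body with one nested clause per name,
    so minimum_should_match counts names across the whole document."""
    should = [
        {"nested": {"path": path, "query": {"match": {field: name}}}}
        for name in names
    ]
    return {"query": {"bool": {"minimum_should_match": minimum, "should": should}}}


names = ["python", "css", "java"]
first_doc = [{"name": "python"}, {"name": "css"}, {"name": "java"}]

# Each object satisfies exactly one clause, so none of them reaches a minimum of 2.
per_object = clauses_matched_per_object(first_doc, names)
print(per_object)  # prints [1, 1, 1]

# Body equivalent to the query above (without the highlight section).
body = build_min_skills_query(names, minimum=2)
print(json.dumps(body, indent=2))
```

The printed body can be sent to POST /test/_search with whatever client you already use.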