I have an index with a field named content. Here is the mapping:
{
  "properties": {
    "content": {
      "type": "text",
      "analyzer": "english"
    }
  }
}
I also have a simple search query:
curl -X GET 'localhost:9200/idx/_search' -H 'content-type: application/json' -d '{
  "query": {
    "match": {
      "content": "yellow fox"
    }
  },
  "fields": [
    "content"
  ]
}'
The response is:
{
  ...
  "hits" : {
    "hits" : [
      {
        ...
        "fields" : {
          "content" : [
            "Yellow foxes jump"
          ]
        },
      }
      ...
}
How can I modify my search query so that it also returns the terms of the content field, as the _analyze API does below?
curl -X GET 127.0.0.1:9200/_analyze -H 'content-type: application/json' -d '{
  "analyzer" : "english",
  "text" : "yellow foxes"
}'
{
  "tokens" : [
    {
      "end_offset" : 6,
      "position" : 0,
      "start_offset" : 0,
      "token" : "yellow",
      "type" : "<ALPHANUM>"
    },
    {
      "end_offset" : 12,
      "position" : 1,
      "start_offset" : 7,
      "token" : "fox",
      "type" : "<ALPHANUM>"
    }
  ]
}
In general, the desired output of the search query looks like this:
{
  ...
  "hits" : {
    "hits" : [
      {
        ...
        "fields" : {
          "content" : [
            "Yellow foxes jump"
          ],
          "content_terms": [
            "yellow", "fox", "jump"
          ]
        },
      }
      ...
}
CodePudding user response:
You don't have to do anything special to search by terms, because you are already doing it. When you pass a sentence to a match query, the sentence itself is tokenized with the same analyzer that was used to index the field.
This means that if the query is "quick brown fox", it searches for the terms "quick", "brown", and "fox". In the case of a phrase query, ES also checks that all the terms occur in proximity to each other.
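One way to check this yourself is to run the query text through the analyzer attached to the field. The sketch below is a request body for the index-scoped _analyze endpoint (idx/_analyze, reusing the index name from the question), sent with curl in the same way as the requests above. It assumes the mapping shown in the question. The field parameter asks Elasticsearch to use the analyzer configured for content instead of naming an analyzer explicitly.

```json
{
  "field": "content",
  "text": "yellow fox"
}
```

With the english analyzer from the mapping, this should produce the tokens yellow and fox, which are the terms the match query looks up in the index.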