I have the following document indexed in ES6:
{
"id": 1234,
...,
"images": [
{
"id": 1703805,
...,
"language_codes": [],
"ingest_source_ids": [123]
},
{
"id": 2481938,
...,
"language_codes": ["EN"],
"ingest_source_ids": [1,2,3]
}
]
}
The images object is mapped as nested.
I can find the document just fine using this query:
{
"query": {
"nested": {
"path": "images",
"query": {
"term": {
"images.ingest_source_ids": 123
}
}
}
}
}
But if I instead search by language_codes, the document is not found:
{
"query": {
"nested": {
"path": "images",
"query": {
"term": {
"images.language_codes": "EN"
}
}
}
}
}
ingest_source_ids has been in the documents since day one; the language_codes field was added later. I recall Elasticsearch doing some automatic mapping based on the initial documents, but as far as I can tell from the documentation, arrays need no special mapping: any field can contain an array as long as all of its values are of the same type.
In this case it works fine with ingest_source_ids, where all values are numeric, but the language_codes values are also always strings, so the same should apply.
What am I missing?
CodePudding user response:
If you have not explicitly defined a mapping for language_codes, then dynamic mapping will by default index it as:
"language_codes": {
"type": "text",
"fields": {
"keyword": {
"type": "keyword",
"ignore_above": 256
}
}
}
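A term query is not analyzed: it looks up the exact term you pass in. The text field, however, is analyzed at index time (by the standard analyzer unless configured otherwise), which lowercases "EN" to "en", so a term query for "EN" on images.language_codes finds nothing. ingest_source_ids works because numbers are mapped to a numeric type and matched exactly. To match the original value, run the term query against the keyword sub-field instead.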
Replace your query with:
{
"query": {
"nested": {
"path": "images",
"query": {
"term": {
"images.language_codes.keyword": "EN"
}
}
}
}
}
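To see why the keyword sub-field matches while the text field does not, here is a minimal Python sketch of the idea. It does not talk to Elasticsearch: simple_analyze is a hypothetical, simplified stand-in for the standard analyzer (it only splits on non-word characters and lowercases), and index_text, index_keyword and term_matches are illustrative helpers, not part of any Elasticsearch client.

```python
import re


def simple_analyze(value):
    """Simplified stand-in for the standard analyzer (assumption):
    split on non-word characters and lowercase each token."""
    return [token.lower() for token in re.findall(r"\w+", value)]


def index_text(values):
    """Terms stored for a `text` field: analyzed tokens of every array element."""
    return {token for value in values for token in simple_analyze(value)}


def index_keyword(values, ignore_above=256):
    """Terms stored for a `keyword` sub-field: the raw values, unchanged.
    Values longer than ignore_above are skipped, as in the default mapping."""
    return {value for value in values if len(value) <= ignore_above}


def term_matches(indexed_terms, query_term):
    """A term query is an exact lookup; the query string is not analyzed."""
    return query_term in indexed_terms


language_codes = ["EN"]
text_terms = index_text(language_codes)        # what images.language_codes holds
keyword_terms = index_keyword(language_codes)  # what images.language_codes.keyword holds

print(term_matches(text_terms, "EN"))     # False: the text field only holds "en"
print(term_matches(keyword_terms, "EN"))  # True: the keyword sub-field holds "EN"
print(term_matches(text_terms, "en"))     # True: matches, but only by accident of casing
```

The last line shows that a term query for "en" on the text field would also find the document, but that relies on knowing how the analyzer rewrote the value; querying the keyword sub-field is the more predictable option for exact matches.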