What data structure is used for the Elasticsearch flattened type?

Time:09-12

I was trying to find out how the flattened type in Elasticsearch works under the hood. The documentation specifies that all leaf values are indexed into a single field as keywords; as a result, there is one dedicated index for all of those flattened keywords.

From documentation:

By default, Elasticsearch indexes all data in every field and each indexed field has a dedicated, optimized data structure. For example, text fields are stored in inverted indices, and numeric and geo fields are stored in BKD trees.

The specific case that I am trying to understand:

If I have a flattened field and index an object with nested objects, it is possible to query a specific nested key in the flattened object. The search request below queries by labels.release:

PUT bug_reports
{
  "mappings": {
    "properties": {
      "labels": {
        "type": "flattened"
      }
    }
  }
}

POST bug_reports/_doc/1
{
  "labels": {
    "priority": "urgent",
    "release": ["v1.2.5", "v1.3.0"]
  }
}
    
POST bug_reports/_search
{
  "query": {
    "term": {"labels.release": "v1.3.0"}
  }
}

Would a flattened field have the same index structure as a keyword field, and how is it able to reference a specific child key of the flattened object?

CodePudding user response:

The initial design and implementation of the flattened field type is described in this issue. The leaf keys are indexed along with the leaf values, which is what makes it possible to search on a specific sub-field.
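To make the idea concrete, here is a minimal Python sketch of how keys and values could share one keyword-style inverted index. It is a toy model, not Elasticsearch's actual code: the function names, the `._keyed` companion field name, and the `\0` separator between key path and value are all assumptions made for illustration.

```python
from collections import defaultdict

SEPARATOR = "\0"  # assumed delimiter between key path and value


def flatten_leaves(obj, prefix=""):
    """Yield (dotted_key_path, leaf_value_as_string) for every leaf."""
    if isinstance(obj, dict):
        for key, value in obj.items():
            path = f"{prefix}.{key}" if prefix else key
            yield from flatten_leaves(value, path)
    elif isinstance(obj, list):
        # Array elements share the key path of the array itself.
        for item in obj:
            yield from flatten_leaves(item, prefix)
    else:
        yield prefix, str(obj)


def index_flattened(inverted_index, field, doc_id, obj):
    """Index each leaf twice: as a bare value and as a key+value term."""
    for path, value in flatten_leaves(obj):
        # Root field: values only, so {"term": {"labels": "urgent"}} can match.
        inverted_index[(field, value)].add(doc_id)
        # Hypothetical keyed field: the key path is prefixed to the value.
        keyed_term = path + SEPARATOR + value
        inverted_index[(field + "._keyed", keyed_term)].add(doc_id)


def term_query(inverted_index, field, sub_key, value):
    """Look up a term on the whole field (sub_key=None) or on one sub-key."""
    if sub_key is None:
        return set(inverted_index.get((field, value), set()))
    keyed_term = sub_key + SEPARATOR + value
    return set(inverted_index.get((field + "._keyed", keyed_term), set()))


index = defaultdict(set)
index_flattened(index, "labels", 1,
                {"priority": "urgent", "release": ["v1.2.5", "v1.3.0"]})
matches = term_query(index, "labels", "release", "v1.3.0")
```

In this model a query on labels.release is rewritten into an exact term lookup on the combined key and value, so the structure is still an ordinary keyword-style inverted index; only the terms differ.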

There are some ongoing improvements to the flattened field type, and Elastic would also like to support numeric values, but that has not been released yet.
