Does updating a non-indexed field trigger reindexing in Elasticsearch 8?

Time:12-02

My index mapping is the following:

{
    "mappings": {
        "dynamic": False,
        "properties": {
            "query_str": {"type": "text", "index": False},
            "search_results": {
                "type": "object",
                "enabled": False
            },
            "query_embedding": {
                "type": "dense_vector",
                "dims": 768
            }
        }
    }
}

The search_results field is disabled. The actual search is performed only via query_embedding; the other fields are just non-searchable data.

If I update the search_results field in an existing document, will it trigger reindexing?

The docs say that "The enabled setting, which can be applied only to the top-level mapping definition and to object fields, causes Elasticsearch to skip parsing of the contents of the field entirely. The JSON can still be retrieved from the _source field, but it is not searchable or stored in any other way". So it seems logical not to reindex documents if the changes took place only in the non-indexed part, but I'm not sure.

Answer:

Elasticsearch documents are stored in Lucene segments, which are immutable, so every change you make to a document marks the old document as deleted and indexes a new one, regardless of which field changed. This is Lucene's behavior:

Lucene's index is composed of segments, each of which contains a subset of all the documents in the index, and is a complete searchable index in itself, over that subset. As documents are written to the index, new segments are created and flushed to directory storage. Segments are immutable; updates and deletions may only create new segments and do not modify existing ones. Over time, the writer merges groups of smaller segments into single larger ones in order to maintain an index that is efficient to search, and to reclaim dead space left behind by deleted (and updated) documents.

When you set enabled: false, you only keep the field's content out of the searchable structures; the data still lives in Lucene as part of _source. So yes, updating search_results rewrites the whole document, including query_embedding.
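To make the delete-and-re-add behavior concrete, here is a minimal toy model in plain Python. It is only a sketch of the idea described above, not how Lucene or Elasticsearch is implemented: the names ToyIndex, Segment, index, update and get are hypothetical, each write creates its own one-document segment, and segment merging is left out.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Segment:
    """Toy stand-in for an immutable segment (hypothetical, not a Lucene class)."""
    docs: tuple  # ((doc_id, source_dict), ...); never modified after creation


@dataclass
class ToyIndex:
    segments: list = field(default_factory=list)
    deleted: set = field(default_factory=set)  # (segment_no, doc_id) tombstones

    def _find_live(self, doc_id):
        # Return (segment_no, source) for the live copy of doc_id, or None.
        for seg_no, seg in enumerate(self.segments):
            for stored_id, source in seg.docs:
                if stored_id == doc_id and (seg_no, doc_id) not in self.deleted:
                    return seg_no, source
        return None

    def index(self, doc_id, source):
        # Old copies are never edited in place: tombstone them, then
        # write the complete new document into a brand-new segment.
        live = self._find_live(doc_id)
        if live is not None:
            self.deleted.add((live[0], doc_id))
        self.segments.append(Segment(docs=((doc_id, dict(source)),)))

    def update(self, doc_id, partial):
        # A partial update is read + merge + full re-index, even when the
        # changed field is one that is not searchable.
        live = self._find_live(doc_id)
        if live is None:
            raise KeyError(doc_id)
        self.index(doc_id, {**live[1], **partial})

    def get(self, doc_id):
        live = self._find_live(doc_id)
        return None if live is None else live[1]


idx = ToyIndex()
idx.index("1", {"query_str": "q",
                "search_results": {"hits": 1},
                "query_embedding": [0.1, 0.2]})
idx.update("1", {"search_results": {"hits": 2}})
print(len(idx.segments), sorted(idx.deleted))
```

In this model, changing only search_results still tombstones the old copy and writes the full document, query_embedding included, into a new segment. That mirrors the point above: whether a field is searchable does not change the cost of the update.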

You can see a similar answer here:

Partial update on field that is not indexed
