ElasticSearch Downtime From Closing and Opening Index

I'm managing an ElasticSearch cluster and I need to add an analyzer to one of my indices. The index I want to update is a bit more than 3TB. Will closing and reopening an index this large to add the analyzer cause an excessive amount of downtime? The documentation doesn't seem to say anything about the processing required to close and open an index.

I have done many rolling restarts and the shard recovery is pretty quick, but I'm guessing that closing and opening an index cannot be done one node at a time with a rolling restart.

CodePudding user response:

According to the official documentation for the open index API:

When opening or closing an index, the master is responsible for restarting the index shards to reflect the new state of the index. The shards will then go through the normal recovery process. The data of opened/closed indices is automatically replicated by the cluster to ensure that enough shard copies are safely kept around at all times.

This makes it clear that it's not a cheap operation: if you have many shards in your cluster and your cluster state is large, publishing the updated state to all the nodes can cause significant overhead.
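If you want to see how far the shards have got through recovery after reopening, one option is to poll the index recovery API (GET /<index>/_recovery). The snippet below is only a rough sketch: the helper names summarize_recovery and fetch_recovery, the base URL and the index name are my own placeholders, and it assumes an unauthenticated HTTP endpoint, so adjust it for your security setup.

```python
import json
import urllib.request
from collections import Counter


def summarize_recovery(recovery_response):
    """Count shard copies per recovery stage for each index.

    Expects the parsed JSON body of GET /<index>/_recovery, which maps
    each index name to a dict with a "shards" list.
    """
    summary = {}
    for index_name, info in recovery_response.items():
        stages = Counter(
            shard.get("stage", "UNKNOWN") for shard in info.get("shards", [])
        )
        summary[index_name] = dict(stages)
    return summary


def fetch_recovery(base_url, index):
    """Fetch the recovery status of one index (assumes no authentication)."""
    with urllib.request.urlopen(f"{base_url}/{index}/_recovery") as resp:
        return json.load(resp)


# Hypothetical usage, with a placeholder URL and index name:
#   status = fetch_recovery("http://localhost:9200", "my-index")
#   print(summarize_recovery(status))
```

Once every shard copy reports the DONE stage, recovery for that index has finished.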

Apart from this, opening and closing an index also allocates its shards, as explained in the wait_for_active_shards section of the same document:

Because opening or closing an index allocates its shards, the wait_for_active_shards setting on index creation applies to the _open and _close index actions as well.
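For reference, the close / update settings / open sequence for adding an analyzer could look roughly like the sketch below. It uses only the _close, _settings and _open endpoints and the wait_for_active_shards query parameter quoted above. The function names, the index name, the analyzer definition and the base URL are placeholders I made up for illustration, and it assumes a cluster without authentication.

```python
import json
import urllib.request


def build_analyzer_update_plan(index, analyzer_name, analyzer_def,
                               wait_for_active_shards="1"):
    """Return the ordered (method, path, body) steps for adding an analyzer."""
    settings = {"analysis": {"analyzer": {analyzer_name: analyzer_def}}}
    return [
        # 1. Close the index; it cannot serve reads or writes from here on.
        ("POST", f"/{index}/_close", None),
        # 2. Analysis settings can only be changed while the index is closed.
        ("PUT", f"/{index}/_settings", settings),
        # 3. Reopen and wait for the requested number of active shard copies.
        ("POST",
         f"/{index}/_open?wait_for_active_shards={wait_for_active_shards}",
         None),
    ]


def run_plan(base_url, plan):
    """Send each step in order; urlopen raises on a non-2xx response."""
    for method, path, body in plan:
        data = json.dumps(body).encode("utf-8") if body is not None else None
        request = urllib.request.Request(
            base_url + path,
            data=data,
            method=method,
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(request) as resp:
            print(method, path, resp.status)


# Hypothetical usage, with placeholder names:
#   plan = build_analyzer_update_plan(
#       "my-index", "my_analyzer",
#       {"type": "custom", "tokenizer": "standard", "filter": ["lowercase"]},
#       wait_for_active_shards="all")
#   run_plan("http://localhost:9200", plan)
```

The index is unavailable between steps 1 and 3, so it is worth timing this sequence on a smaller test index before running it against the 3TB one.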

This is the major overhead, since it involves moving data (i.e. shards) around the cluster, and yours is a very large index, so it can cause a huge amount of data movement in your cluster.

Hope this helps.