I am using Elasticsearch 7.13.4. I can't find a precise answer to this anywhere online. Is there a formula for it?
CodePudding user response:
The underlying concept is called "sliced scroll" and is independent of the number of shards. From the linked documentation:
By default the splitting is done on the shards first and then locally on each shard using the _id field with the following formula: slice(doc) = floorMod(hashCode(doc._id), max). For instance, if the number of shards is equal to 2 and the user requested 4 slices, then the slices 0 and 2 are assigned to the first shard and the slices 1 and 3 are assigned to the second shard.
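To make the quoted formula concrete, here is a minimal Python sketch. It is only an illustration: `placeholder_hash` is a stand-in (CRC32) and not the hash Elasticsearch applies to `_id`, so the slice numbers it produces will not match a real cluster. The slice-to-shard mapping in `slices_on_shard` is an assumption inferred from the 2-shard / 4-slice example in the quote, and all function names are hypothetical.

```python
import zlib


def floor_mod(value: int, modulus: int) -> int:
    # For a positive modulus, Python's % is always non-negative,
    # which is the behaviour of Java's Math.floorMod.
    return value % modulus


def placeholder_hash(doc_id: str) -> int:
    # Assumption: stand-in hash for illustration only.
    # This is NOT the hash function Elasticsearch uses internally.
    return zlib.crc32(doc_id.encode("utf-8"))


def slice_for_doc(doc_id: str, max_slices: int) -> int:
    # slice(doc) = floorMod(hashCode(doc._id), max)
    return floor_mod(placeholder_hash(doc_id), max_slices)


def slices_on_shard(shard_index: int, num_shards: int, num_slices: int) -> list[int]:
    # Assumption: slice s is served by shard (s mod num_shards), which
    # reproduces the documented example (2 shards, 4 slices).
    return [s for s in range(num_slices) if s % num_shards == shard_index]


if __name__ == "__main__":
    print(slices_on_shard(0, num_shards=2, num_slices=4))
    print(slices_on_shard(1, num_shards=2, num_slices=4))
    print(slice_for_doc("some-document-id", 4))
```

The point of `floor_mod` is that the result always falls in the range 0 to max - 1, even if the hash value is negative, so every document lands in exactly one slice.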
However, when using automatic slicing (i.e. slices=auto), Elasticsearch will pick the number of shards as the number of slices:
Setting slices to auto will let Elasticsearch choose the number of slices to use. This setting will use one slice per shard, up to a certain limit. If there are multiple sources, it will choose the number of slices based on the index or backing index with the smallest number of shards.
Query performance is most efficient when the number of slices is equal to the number of shards in the index. [...] Setting slices higher than the number of shards generally does not improve efficiency and adds overhead.
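Put as a formula, the quoted rule for slices=auto amounts to taking the smallest shard count among the source indices and capping it at a limit. The sketch below expresses that in Python. The function name is hypothetical, and the limit is passed in as a parameter because the quoted text only says "up to a certain limit" without giving a number.

```python
def auto_slice_count(shard_counts: list[int], limit: int) -> int:
    """Approximate the slices=auto rule described in the quoted documentation.

    shard_counts: number of primary shards of each source index
    limit: upper bound on the slice count (assumption: supplied by the
           caller, since the quoted text does not state its value)
    """
    if not shard_counts:
        raise ValueError("at least one source index is required")
    # One slice per shard, based on the source with the fewest shards,
    # and never more than the limit.
    return min(min(shard_counts), limit)


if __name__ == "__main__":
    # Two sources with 5 and 3 shards: the smaller count is used.
    print(auto_slice_count([5, 3], limit=20))
    # One source with more shards than the limit: the limit is used.
    print(auto_slice_count([40], limit=20))
```

So for a single index, the answer to "how many slices" is simply its shard count, unless that exceeds the limit.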