Does the document count matter when considering shard size?

Time:07-27

I am reading this doc https://www.elastic.co/guide/en/elasticsearch/reference/7.17/size-your-shards.html to decide how many shards I need.

It mentions factors such as data size per shard and node heap size. For example, it says to generally keep each shard between 10 GB and 50 GB, but it doesn't mention document count.

I have data where each document is very small but there are a lot of them: 10 million documents take about 5 GB of storage. In this case, should I use 1 shard?

A query executes on a single thread per shard, so searching 10 million documents in one thread is probably not a good idea.

How should I size shards for a large number of small documents in this case?

CodePudding user response:

There is a per-shard limit at the Lucene level of about 2^31 documents (roughly 2.1 billion), as per "Elasticsearch and Lucene document limit".

And while the recommended shard size is under 50 GB, you can have smaller indices if that's all the data you have. The big, unstated point in that recommendation is that you shouldn't have a ton of really small indices, e.g. thousands of indices with 1 shard and a small number of documents each. You are better off merging them (if you can), as ultimately the bulk of the resource use depends on the number of shards, not the number of documents.

For what you want, use 1 primary shard.
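
As a rough sanity check of that conclusion, here is a minimal Python sketch that picks the smallest primary shard count satisfying both a size ceiling and a per-shard document ceiling. The function name and constants are illustrative assumptions, not part of any Elasticsearch API: the 50 GB ceiling is the upper end of the guidance quoted in the question, and 2,147,483,519 is the per-shard document limit that the Elasticsearch documentation gives for Lucene.

```python
import math

# Assumption: per-shard document ceiling documented by Elasticsearch for Lucene.
LUCENE_MAX_DOCS_PER_SHARD = 2_147_483_519
# Assumption: upper end of the 10 GB to 50 GB guidance quoted in the question.
MAX_SHARD_SIZE_GB = 50


def suggest_primary_shards(total_size_gb: float, doc_count: int) -> int:
    """Return the smallest primary shard count that respects both ceilings."""
    by_size = math.ceil(total_size_gb / MAX_SHARD_SIZE_GB)
    by_docs = math.ceil(doc_count / LUCENE_MAX_DOCS_PER_SHARD)
    # Never suggest fewer than one primary shard.
    return max(1, by_size, by_docs)


# The case from the question: 5 GB holding 10 million documents.
print(suggest_primary_shards(total_size_gb=5, doc_count=10_000_000))  # 1
```

With these assumptions, 10 million documents are well under one percent of the per-shard document ceiling, so data size, not document count, is what would eventually push the shard count above 1. This only estimates a lower bound; it does not account for query load or growth.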
