Optimization on old indexes collecting logs from my apps

I have an Elasticsearch cluster with 3 nodes (each with 6 CPUs, 31 GB heap, and 64 GB RAM) collecting 25 GB of logs per day. After 3 months, I noticed that my dashboards have become very slow when checking stats for past weeks. Is there a way to improve read performance on the indexes so that my dashboard stats are calculated faster?

Thanks!

CodePudding user response：

I would suggest increasing the number of shards. With more shards, Elasticsearch splits your data across them, so a search is sent to several shards in parallel and each request scans a smaller slice of the data.

For the number of shards, you could size it based on your heap memory. No matter what JVM heap size you actually have, the upper bound on the shard count should be 20 shards per 1 GB of heap configured on the server.
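
As a rough illustration of that rule of thumb applied to the cluster in the question (3 nodes, 31 GB heap each), here is a minimal Python sketch. The helper name `max_shards` and the constant are made up for this example, and the 20-shards-per-GB figure is simply the guideline quoted above, used as a ceiling rather than a target.

```python
# Rule-of-thumb ceiling from the answer above: 20 shards per 1 GB of JVM heap.
SHARDS_PER_GB_HEAP = 20


def max_shards(heap_gb_per_node: int, node_count: int) -> dict:
    """Return per-node and cluster-wide shard ceilings (hypothetical helper)."""
    per_node = heap_gb_per_node * SHARDS_PER_GB_HEAP
    return {"per_node": per_node, "cluster": per_node * node_count}


# Figures from the question: 3 nodes with 31 GB heap each.
limits = max_shards(heap_gb_per_node=31, node_count=3)
print(limits)  # {'per_node': 620, 'cluster': 1860}
```

These numbers are a limit to stay under, not a recommended shard count. As far as I know, replica shards count toward the limit as well as primaries.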

ElasticSearch - Optimal number of Shards per node
https://qbox.io/blog/optimizing-elasticsearch-how-many-shards-per-index
https://opster.com/elasticsearch-glossary/elasticsearch-choose-number-of-shards/

CodePudding user response：

It seems that the amount of data you have accumulated and use for your dashboards is causing the performance problems.

A straightforward option is to increase your cluster's resources, but then you're bound to hit the same problem again later. So you should rather rethink your data retention policy.

Chances are that you are really only interested in the most recent data. You need to decide what "recent" means in your use case and simply discard anything older than that.

Elasticsearch has tools to automate this. Look into Index Lifecycle Management (ILM).

What you probably need is to create an index template and apply a lifecycle policy to it. Elasticsearch will then handle automatic rollover of indices, deletion of old data, and even migration through data tiers in a hot-warm-cold architecture if you really want very long retention periods.
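
To make that more concrete, here is a hedged sketch of what the two request bodies could look like, built as plain Python dicts and printed as JSON. The policy name `logs-retention`, the template name `logs-template`, the `logs-*` pattern, the `logs` rollover alias, and the 30-day retention are all placeholder assumptions, so pick values that match your own definition of "recent". The bodies are meant for `PUT _ilm/policy/logs-retention` and `PUT _index_template/logs-template`. The `max_primary_shard_size` rollover condition assumes a reasonably recent Elasticsearch version.

```python
import json

# Hypothetical names and retention values; adjust to your own use case.
POLICY_NAME = "logs-retention"
RETENTION = "30d"

# Body for: PUT _ilm/policy/logs-retention
ilm_policy = {
    "policy": {
        "phases": {
            "hot": {
                "actions": {
                    # Roll over daily, or earlier once a primary shard reaches 50 GB.
                    "rollover": {"max_age": "1d", "max_primary_shard_size": "50gb"}
                }
            },
            "delete": {
                # Delete the index once it is this old (age counted from rollover).
                "min_age": RETENTION,
                "actions": {"delete": {}},
            },
        }
    }
}

# Body for: PUT _index_template/logs-template
index_template = {
    "index_patterns": ["logs-*"],
    "template": {
        "settings": {
            # Attach the lifecycle policy to every index matching the pattern.
            "index.lifecycle.name": POLICY_NAME,
            "index.lifecycle.rollover_alias": "logs",
        }
    },
}

print(json.dumps(ilm_policy, indent=2))
print(json.dumps(index_template, indent=2))
```

Rollover through an alias also needs a first index created with that alias marked as the write index. Warm and cold phases can be added to the same policy in the same way if you want longer retention.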

All of this will lead to more predictable performance from your cluster.