Many articles online, including the official documentation for Elasticsearch 8.0, show how to run a multi-node Elasticsearch cluster with docker-compose. However, I cannot find a reason why you would set up multiple nodes on the same Docker host. Is this a recommended setup for a production environment, or is it only an example meant to demonstrate the concept?
CodePudding user response:
You shouldn't consider this a production environment. The guides are examples, typically meant for lab environments and for testing scenarios with your application. I would not consider them production-ready. Compose is often not considered a production-grade tool because everything it does targets a single Docker host, whereas in production you typically want multiple nodes spread across multiple availability zones.
CodePudding user response:
Since the heap of a single ES node should never exceed half of the memory available to that node (and should stay below ~30.5GB), running several nodes on one host makes sense when the host has ample memory (say 128GB). In that case you could run 2 ES nodes on the same host, each with 64GB of memory (30.5GB for the heap and the rest for Lucene), by constraining each Docker container accordingly.
Note that the above is not specific to Docker: you can always configure several nodes per host, with or without Docker.
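To make the sizing rule concrete, here is a rough Python sketch of the arithmetic. The function name `plan_nodes` and its return format are made up for illustration, and it ignores memory needed by the OS and other processes, so treat the numbers as a starting point rather than a recommendation.

```python
import math

# Approximate heap ceiling mentioned above (assumption: ~30.5GB).
MAX_HEAP_GB = 30.5


def plan_nodes(host_memory_gb, max_heap_gb=MAX_HEAP_GB):
    """Hypothetical helper: split a host's memory between ES nodes.

    Rule applied: heap <= half of the node's memory, and heap <= max_heap_gb.
    """
    # A node only reaches the heap ceiling once it has about twice that
    # much memory, so fit as many such nodes as the host allows (at least 1).
    nodes = max(1, math.floor(host_memory_gb / (2 * max_heap_gb)))
    memory_per_node = host_memory_gb / nodes
    heap = min(memory_per_node / 2, max_heap_gb)
    return {
        "nodes": nodes,
        "memory_per_node_gb": memory_per_node,
        "heap_gb": heap,
        # Whatever is not heap stays available to Lucene.
        "off_heap_gb": memory_per_node - heap,
    }


plan = plan_nodes(128)
print(plan)
```

For the 128GB host from the example, this yields 2 nodes with 64GB each and a 30.5GB heap, which matches the layout described above. On a 32GB host it falls back to a single node with a 16GB heap.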
Regarding production: since 2 nodes would run on the same host, losing that host means losing two nodes at once, which is not good. However, depending on how many hosts you have, it might be a lesser problem, but only if each host is in a different availability zone and you have the appropriate cluster/shard allocation awareness settings configured, which ensure that your data is replicated across 2 availability zones. In this case, losing a host (2 nodes) would still keep your cluster running, although in degraded mode.
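As a rough illustration of the allocation awareness settings, the Python sketch below only builds the configuration values and does not talk to a cluster. The attribute name `zone`, the zone names, the node names and both helper functions are assumptions made up for this example. The setting keys `node.attr.*`, `cluster.routing.allocation.awareness.attributes` and `cluster.routing.allocation.same_shard.host` are standard Elasticsearch settings, but check the documentation for your version before relying on them.

```python
import json


def node_settings(node_name, zone, attribute="zone"):
    """Per-node settings (elasticsearch.yml or container environment).

    Tags the node with the availability zone of the host it runs on.
    """
    return {
        "node.name": node_name,
        f"node.attr.{attribute}": zone,
    }


def cluster_settings_body(attribute="zone"):
    """Request body one could send to PUT _cluster/settings."""
    return {
        "persistent": {
            # Spread the copies of each shard across the values of this attribute.
            "cluster.routing.allocation.awareness.attributes": attribute,
            # Avoid placing two copies of a shard on nodes sharing a host.
            "cluster.routing.allocation.same_shard.host": True,
        }
    }


# Hypothetical layout: 2 hosts, each in its own zone, 2 nodes per host.
layout = {"zone-a": ["es01", "es02"], "zone-b": ["es03", "es04"]}
nodes = [node_settings(name, zone) for zone, names in layout.items() for name in names]
body = cluster_settings_body()

print(json.dumps(nodes, indent=2))
print(json.dumps(body, indent=2))
```

With at least one replica per index, this kind of setup should leave a copy of every shard in the surviving zone when one host goes down.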