Fault Tolerance and Kubernetes StatefulSet

As I understand it, most databases enable the use of replicas that can take over from a leader in case the leader is unavailable.

I'm wondering about the necessity of having these replicas in a Kubernetes environment, when using, say, a StatefulSet. Once the pod becomes unresponsive, Kubernetes will restart it, right? And the PVC will make sure the data isn't lost.

Is it that leader election is a faster process than bringing up a new application?

Or is it that the only advantage of the replicas is to provide load balancing for read queries?

CodePudding user response:

As I understand it, most databases enable the use of replicas that can take over from a leader in case the leader is unavailable.

I'm wondering about the necessity of having these replicas in a Kubernetes environment, when using, say, a StatefulSet.

There has been a move from single-node databases to distributed databases. Distributed databases typically run as a cluster of 3 or 5 replicas (instances). The primary purpose is high availability and fault tolerance against, for example, node or disk failure. This is the same when the database runs on Kubernetes.

the PVC will make sure the data isn't lost.

The purpose of PVCs is to decouple the application configuration from the choice of storage system. This lets you deploy the same application on, for example, Google Cloud, AWS, and Minikube without changing its configuration, even though each uses a different storage system. It does not change how the storage systems themselves work.
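As a minimal sketch of that decoupling, here is a claim written as a Python dict that mirrors the fields of a Kubernetes PersistentVolumeClaim manifest. The claim name `data` and the size `10Gi` are made-up values for illustration. The sketch assumes each cluster has a default StorageClass configured: because `storageClassName` is left out, the cluster decides which storage system backs the volume.

```python
import json


def build_claim(name: str, size: str) -> dict:
    """Return a PersistentVolumeClaim manifest as a plain dict.

    No storageClassName is set, so the cluster's default StorageClass
    (assuming one is configured) decides which storage system is used.
    """
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            "accessModes": ["ReadWriteOnce"],
            "resources": {"requests": {"storage": size}},
        },
    }


# Hypothetical name and size, chosen only for illustration.
claim = build_claim("data", "10Gi")
print(json.dumps(claim, indent=2))
```

The claim only says how much storage is wanted and how it is accessed. Whether that storage survives a node or disk failure depends entirely on the storage system behind it, not on the claim.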

Is it that leader election is a faster process than bringing up a new application?

Many different things can fail: the node or the storage system can fail, or the network can be partitioned so that you cannot reach a certain node.

Leader election is just one of the mitigations against these problems in a clustered setup; you also need to replicate all data in a consistent way. The Raft consensus algorithm is a common solution for this in modern distributed databases.
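To make the replica counts concrete: Raft-style consensus needs a majority of members to agree before a write is committed or a leader is elected. The following is a small sketch of that majority arithmetic only, not of Raft itself; the function names are my own.

```python
def quorum(cluster_size: int) -> int:
    """Smallest number of members that forms a majority."""
    return cluster_size // 2 + 1


def tolerated_failures(cluster_size: int) -> int:
    """How many members can be lost while a majority still remains."""
    return cluster_size - quorum(cluster_size)


for n in (1, 3, 4, 5):
    print(f"{n} members: quorum {quorum(n)}, "
          f"tolerates {tolerated_failures(n)} failure(s)")
```

A single instance tolerates no failures, 3 members tolerate one, and 5 tolerate two. A 4-member cluster tolerates no more failures than a 3-member one, which is why odd cluster sizes are the usual choice.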

Or is it that the only advantage of the replicas is to provide load balancing for read queries?

This can be an advantage of distributed databases, yes. But in my experience it is seldom the primary reason for using them.