I'm unable to find any reference, other than this link, confirming that the failures have to be consecutive: https://github.com/kubernetes/website/issues/37414
Background: Our Java application is restarted every day because of liveness probe failures. The application's access logs don't show 3 consecutive failures, so I wanted to understand how probes behave.
CodePudding user response:
The liveness check is created when Kubernetes creates the Pod and is re-created each time the container is restarted. In your configuration you have set initialDelaySeconds: 20, so after the container starts, Kubernetes waits 20 seconds before the first probe and then repeats the probe every periodSeconds. After failureThreshold consecutive failures (default failureThreshold: 3), Kubernetes restarts the container according to the Pod's restartPolicy. You will also be able to find this in the logs.
When you use kubectl get events, you only get events from the last hour:
kubectl get events
LAST SEEN   TYPE      REASON      OBJECT
47m         Normal    Starting    node/kubeadm
43m         Normal    Scheduled   pod/liveness-http
43m         Normal    Pulling     pod/liveness-http
43m         Normal    Pulled      pod/liveness-http
43m         Normal    Created     pod/liveness-http
43m         Normal    Started     pod/liveness-http
4m41s       Warning   Unhealthy   pod/liveness-http
40m         Warning   Unhealthy   pod/liveness-http
12m20s      Warning   BackOff     pod/liveness-http
The same command after ~1 hour:
LAST SEEN   TYPE      REASON      OBJECT
43s         Normal    Pulling     pod/liveness-http
8m40s       Warning   Unhealthy   pod/liveness-http
20m         Warning   BackOff     pod/liveness-http
So that might be the reason you are seeing only one failure.
A liveness probe can be configured using the fields below:
initialDelaySeconds: Number of seconds after the container has started before liveness or readiness probes are initiated. Defaults to 0 seconds. Minimum value is 0.
periodSeconds: How often (in seconds) to perform the probe. Defaults to 10 seconds. Minimum value is 1.
timeoutSeconds: Number of seconds after which the probe times out. Defaults to 1 second. Minimum value is 1.
successThreshold: Minimum consecutive successes for the probe to be considered successful after having failed. Defaults to 1. Must be 1 for liveness. Minimum value is 1.
failureThreshold: When a probe fails, Kubernetes will try failureThreshold times before giving up. Giving up in case of liveness probe means restarting the container. In case of a readiness probe the Pod will be marked Unready. Defaults to 3. Minimum value is 1.
If you set the minimal values for periodSeconds, timeoutSeconds, successThreshold and failureThreshold you can expect more frequent checks and faster restarts.
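To get a feel for how these fields combine, here is a rough sketch in Python. The function name is hypothetical, and the formula is only an approximation: it assumes every probe in the streak fails, that the last one runs into its timeout, and it ignores kubelet scheduling jitter and the time the container needs to terminate.

```python
def seconds_until_restart_decision(period_seconds=10, timeout_seconds=1, failure_threshold=3):
    """Approximate upper bound, in seconds, from the start of the first failed
    probe to the moment the failure threshold is reached.

    Assumption (not from the Kubernetes docs): all probes in the streak fail,
    and the final probe only fails once its timeout expires.
    """
    # (failure_threshold - 1) full periods separate the first and last probe,
    # and the last probe can take up to timeout_seconds to report failure.
    return (failure_threshold - 1) * period_seconds + timeout_seconds


print(seconds_until_restart_decision())         # default values -> 21
print(seconds_until_restart_decision(1, 1, 1))  # minimum values -> 1
```

With the defaults, the restart decision comes roughly 21 seconds after the first failing probe starts; with all values at their minimum, it comes after about 1 second.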
Liveness probe:
- Kubernetes will restart a container in a Pod after failureThreshold consecutive failed probes. The default is 3, so after 3 failed probes in a row.
- Depending on how the container is configured, the time needed for container termination can vary considerably.
- You can adjust both failureThreshold and terminationGracePeriodSeconds so that the container is restarted immediately after a single failed probe.
You can find more information in "liveness probe configuration" and "best practices".
CodePudding user response:
Yes, the failures have to be consecutive, according to the API docs:
Minimum consecutive failures for the probe to be considered failed after having succeeded. Defaults to 3. Minimum value is 1.
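As an illustration of what "consecutive" means here, the following Python sketch models the counting in a simplified way. It is not the kubelet's actual code; the function name and the sample probe sequences are made up for the example. The key assumption is that any successful probe resets the failure streak to zero.

```python
def restart_points(results, failure_threshold=3):
    """Return the indices of the probes at which a restart would be triggered.

    results: iterable of booleans, True meaning the probe succeeded.
    Simplified model: a success resets the failure streak.
    """
    restarts = []
    streak = 0
    for index, succeeded in enumerate(results):
        if succeeded:
            streak = 0  # one success wipes out earlier failures
        else:
            streak += 1
            if streak >= failure_threshold:
                restarts.append(index)
                streak = 0  # after a restart, counting starts over
    return restarts


scattered = [False, True, False, True, False, True]  # 3 failures, never in a row
in_a_row = [True, False, False, False, True]         # 3 failures back to back

print(restart_points(scattered))  # []
print(restart_points(in_a_row))   # [3]
```

Under this model, three failures spread across the day never cause a restart; only three failed probes in a row do. If the container is being restarted, the three failures should therefore fall within a window of roughly two probe periods.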