I am trying to create a Kubernetes Job with backoffLimit set to 4. If a Pod in the Job fails, I want to wait n minutes or n seconds before the next Pod is created. Is there a way to do this?
apiVersion: batch/v1
kind: Job
metadata:
  name: pi-test
spec:
  template:
    spec:
      containers:
      - name: pi
        image: bitnami/git:latest
        command: ["/bin/bash", "-c", "gits clone ls -b master"]
      restartPolicy: OnFailure
  backoffLimit: 4
  activeDeadlineSeconds: 120
Answer:
You can achieve this by using a liveness probe.
Example YAML (adjust it to your needs):
apiVersion: batch/v1
kind: Job
metadata:
  name: pi-test
spec:
  template:
    spec:
      containers:
      - name: pi
        image: bitnami/git:latest
        args:
        - /bin/sh
        - -c
        - touch /tmp/healthy; sleep 30; rm -rf /tmp/healthy; sleep 600
        livenessProbe: # Check if healthy
          exec:
            command:
            - cat
            - /tmp/healthy
          initialDelaySeconds: 5
          periodSeconds: 5
        command: ["/bin/bash", "-c", "gits clone ls -b master"]
      restartPolicy: OnFailure
  backoffLimit: 4
  activeDeadlineSeconds: 120
Note for initialDelaySeconds: If you don’t set the initial delay, the prober will start probing the container as soon as it starts, which usually leads to the probe failing, because the app isn’t ready to start receiving requests. If the number of failures exceeds the failure threshold, the container is restarted before it’s even able to start responding to requests properly.
Note for Pod backoff failure policy: Failed Pods associated with the Job are recreated by the Job controller with an exponential back-off delay (10s, 20s, 40s ...) capped at six minutes. Ref:
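To make the back-off note concrete, here is a rough sketch of the delay schedule it describes. It assumes the delay simply doubles from 10 seconds and is capped at 360 seconds (six minutes); the function name `job_backoff_delays` and its parameters are made up for illustration, and the real controller's timing may differ in detail.

```python
def job_backoff_delays(backoff_limit, base_seconds=10, cap_seconds=360):
    """Approximate wait before each retry, in seconds.

    Assumption: the delay doubles on every retry (10s, 20s, 40s, ...)
    and never exceeds the cap. This models the documented behaviour,
    not the Job controller's actual code.
    """
    return [
        min(base_seconds * 2 ** attempt, cap_seconds)
        for attempt in range(backoff_limit)
    ]


if __name__ == "__main__":
    # backoffLimit: 4, as in the Job spec from the question
    for retry, delay in enumerate(job_backoff_delays(4), start=1):
        print(f"retry {retry}: wait about {delay}s")
```

Under these assumptions, a Job with backoffLimit: 4 would wait roughly 10, 20, 40 and 80 seconds between attempts. The schedule is determined by the controller rather than by a field in the Job spec shown here.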