I am using Spark 3.0.3. I trigger a Spark job that uses Kubernetes as its resource manager. The driver pod does not get deleted; it just sits there in the Completed state. I want to clean up the driver pod as well once the job is complete.
CodePudding user response:
To automatically remove a Job's pod once it has completed, you can use the CronJob history limits: https://kubernetes.io/docs/tasks/job/automated-tasks-with-cron-jobs/
The .spec.successfulJobsHistoryLimit and .spec.failedJobsHistoryLimit fields are optional. These fields specify how many completed and failed jobs should be kept. By default, they are set to 3 and 1 respectively. Setting a limit to 0 corresponds to keeping none of the corresponding kind of jobs after they finish.
Example:
apiVersion: batch/v1beta1  # use batch/v1 on Kubernetes 1.21+; batch/v1beta1 CronJob was removed in 1.25
kind: CronJob
metadata:
  name: hello
spec:
  schedule: "*/1 * * * *"
  successfulJobsHistoryLimit: 0
  failedJobsHistoryLimit: 0
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: hello
            image: busybox
            args:
            - /bin/sh
            - -c
            - date; echo Hello from the Kubernetes cluster
          restartPolicy: OnFailure
https://kubernetes.io/docs/tasks/job/automated-tasks-with-cron-jobs/#jobs-history-limits
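To make the effect of the history limits concrete, here is a minimal Python sketch of the pruning rule described above. It is only a model for illustration, not the CronJob controller's actual code: the function name `jobs_to_prune`, the `(name, finish_time)` tuples and the sample job names are all hypothetical, and it assumes the oldest finished jobs are the ones removed first.

```python
from datetime import datetime, timedelta


def jobs_to_prune(finished, limit):
    """Return the names of finished jobs that exceed a history limit.

    finished: list of (name, finish_time) tuples for one kind of job
              (successful or failed). Hypothetical data shape.
    limit:    value of the matching history limit; 0 keeps nothing.
    """
    # Sort newest first, so everything past the first `limit` entries
    # is the oldest history and gets pruned (assumed ordering).
    newest_first = sorted(finished, key=lambda job: job[1], reverse=True)
    return [name for name, _ in newest_first[limit:]]


# Five hypothetical successful runs, one minute apart.
base = datetime(2024, 1, 1, 12, 0)
successful = [(f"hello-{i}", base + timedelta(minutes=i)) for i in range(5)]

print(jobs_to_prune(successful, 3))  # the two oldest runs
print(jobs_to_prune(successful, 0))  # every run
```

With a limit of 3 only the two oldest runs are selected for removal, while a limit of 0 selects every finished run, which is the setting used in the CronJob example above.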
CodePudding user response:
Based on this official documentation, you can also use the TTL mechanism for finished Jobs. Bear in mind that this feature has only been enabled by default since Kubernetes 1.21 and, at the time of writing, is in beta:
Another way to clean up finished Jobs (either Complete or Failed) automatically is to use a TTL mechanism provided by a TTL controller for finished resources, by specifying the .spec.ttlSecondsAfterFinished field of the Job.
When the TTL controller cleans up the Job, it will delete the Job cascadingly, i.e. delete its dependent objects, such as Pods, together with the Job. Note that when the Job is deleted, its lifecycle guarantees, such as finalizers, will be honored.
For example:
apiVersion: batch/v1
kind: Job
metadata:
  name: pi-with-ttl
spec:
  ttlSecondsAfterFinished: 100
  template:
    spec:
      containers:
      - name: pi
        image: perl
        command: ["perl", "-Mbignum=bpi", "-wle", "print bpi(2000)"]
      restartPolicy: Never
The Job pi-with-ttl will be eligible to be automatically deleted 100 seconds after it finishes. If the field is set to 0, the Job will be eligible to be automatically deleted immediately after it finishes. If the field is unset, this Job won't be cleaned up by the TTL controller after it finishes.
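The three cases in the quoted text (a positive TTL, a TTL of 0, and an unset field) can be sketched in a few lines of Python. This only models the timing rule as the documentation describes it; the helper names `ttl_expiry` and `is_eligible_for_cleanup` and the sample timestamps are hypothetical, and the real TTL controller may delete the Job some time after it becomes eligible.

```python
from datetime import datetime, timedelta, timezone


def ttl_expiry(finish_time, ttl_seconds_after_finished):
    """Return when a finished Job becomes eligible for deletion.

    Returns None when the field is unset, meaning the TTL controller
    never cleans the Job up.
    """
    if ttl_seconds_after_finished is None:
        return None
    return finish_time + timedelta(seconds=ttl_seconds_after_finished)


def is_eligible_for_cleanup(now, finish_time, ttl_seconds_after_finished):
    """True once `now` has reached the expiry time, False otherwise."""
    expiry = ttl_expiry(finish_time, ttl_seconds_after_finished)
    return expiry is not None and now >= expiry


# Hypothetical finish time for the pi-with-ttl Job.
finished = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)

print(ttl_expiry(finished, 100))   # 100 seconds after the finish time
print(ttl_expiry(finished, 0))     # the finish time itself
print(ttl_expiry(finished, None))  # None: never cleaned up by TTL
```

For a Spark driver pod, which is created directly rather than through a Job, this rule would only apply if the application is wrapped in a Job object; that is an assumption about the setup, not something the question states.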
There is also a similar question with a good answer provided by whites11.
See also the GitHub page for k8s-job-cleaner. It could be another solution to your problem.