K8s jobs and pods differences as uses of host subdomain


I have Kubernetes, managed with Helm 3.

  1. I need to access a Kubernetes Job (defined in a YAML file created by Helm) while it is running.

The kubectl version:

Client Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.6", GitCommit:"d921bc6d1810da51177fbd0ed61dc811c5228097", GitTreeState:"clean", BuildDate:"2021-10-27T17:50:34Z", GoVersion:"go1.16.9", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"21", GitVersion:"v1.21.6", GitCommit:"d921bc6d1810da51177fbd0ed61dc811c5228097", GitTreeState:"clean", BuildDate:"2021-10-27T17:44:26Z", GoVersion:"go1.16.9", Compiler:"gc", Platform:"linux/amd64"}

Helm version:

version.BuildInfo{Version:"v3.3.4", GitCommit:"a61ce5633af99708171414353ed49547cf05013d", GitTreeState:"clean", GoVersion:"go1.14.9"}

I followed the documentation in the following link: DNS concept

It works fine for a Pod, but not for a Job.

As explained there, I put hostname and subdomain in the Pod's YAML file and added a Service that holds the domain...

  2. I need to check the Pod's state while the Job is running.

For a Pod, there is a ready condition:

kubectl wait pod/pod-name --for=condition=ready ...

For a Job there is no ready condition (even while the Pod behind it is running).

How can I check the state of the Pod behind the Job (while the Job is running), and how can I use hostname/subdomain for Jobs?

My code is below (I removed some security-related tags, but it is otherwise the same; note that it may look complicated).

I create a listener that runs and listens, plus a tester Job that has to run a curl command against it, which only works if the tester can reach the Pod behind the listener Job:

Listener (the Pod behind this Job is the one the tester needs to reach):

What I added are the hostname and subdomain fields (they work for a Pod, but not for a Job). When the same spec was on a plain Pod, there was no problem.

I also noticed that the name of the Pod created by the Job gets an automatic hash suffix.

apiVersion: batch/v1
kind: Job
metadata:
  name: {{ include "my-project.fullname" . }}-listener
  namespace: {{ .Release.Namespace }}
  labels:
    name: {{ include "my-project.fullname" . }}-listener
    app: {{ include "my-project.fullname" . }}-listener
    component: {{ .Chart.Name }}
    subcomponent: {{ .Chart.Name }}-listener
  annotations:
    "prometheus.io/scrape": {{ .Values.prometheus.scrape | quote }}
    "prometheus.io/path": {{ .Values.prometheus.path }}
    "prometheus.io/port": {{ .Values.ports.api.container | quote }}
spec:
  template: #PodTemplateSpec (Core/V1)
    spec: #PodSpec (core/v1)
      hostname: {{ include "my-project.fullname" . }}-listener
      subdomain: {{ include "my-project.fullname" . }}-listener-dmn
      initContainers:
        # appears twice - could be factored out into helpers.tpl
        - name: wait-mysql-exist-pod
          image: {{ .Values.global.registry }}/{{ .Values.global.k8s.image }}:{{ .Values.global.k8s.tag | default "latest" }}
          imagePullPolicy: IfNotPresent
          env:
            - name: MYSQL_POD_NAME
              value: {{ .Release.Name }}-mysql
            - name: COMPONENT_NAME
              value: {{ .Values.global.mysql.database.name }}
          command:
            - /bin/sh
          args:
            - -c
            - |-
              while [ "$(kubectl get pod $MYSQL_POD_NAME 2>/dev/null | grep $MYSQL_POD_NAME | awk '{print $1;}')" \!= "$MYSQL_POD_NAME" ];do
                echo 'Waiting for mysql pod to be existed...';
                sleep 5;
              done
        - name: wait-mysql-ready
          image: {{ .Values.global.registry }}/{{ .Values.global.k8s.image }}:{{ .Values.global.k8s.tag | default "latest" }}
          imagePullPolicy: IfNotPresent
          env:
            - name: MYSQL_POD_NAME
              value: {{ .Release.Name }}-mysql
          command:
            - kubectl
          args:
            - wait
            - pod/$(MYSQL_POD_NAME)
            - --for=condition=ready
            - --timeout=120s
        - name: wait-mysql-has-db
          image: {{ .Values.global.registry }}/{{ .Values.global.k8s.image }}:{{ .Values.global.k8s.tag | default "latest" }}
          imagePullPolicy: IfNotPresent
          env:
            {{- include "k8s.db.env" . | nindent 12 }}
            - name: MYSQL_POD_NAME
              value: {{ .Release.Name }}-mysql
          command:
            - /bin/sh
          args:
            - -c
            - |-
             while [ "$(kubectl exec $MYSQL_POD_NAME -- mysql -uroot -p$MYSQL_ROOT_PASSWORD -e 'show databases' 2>/dev/null | grep $MYSQL_DATABASE | awk '{print $1;}')" \!= "$MYSQL_DATABASE" ]; do
                echo 'Waiting for mysql database up...';
                sleep 5;
             done
      containers:
        - name: {{ include "my-project.fullname" . }}-listener
          image:  {{ .Values.global.registry }}/{{ .Values.image.repository }}:{{ .Values.image.tag | default "latest" }}
          imagePullPolicy: {{ .Values.image.pullPolicy }}
          env:
          {{- include "k8s.db.env" . | nindent 12 }}
            - name: SCHEDULER_DB
              value: $(CONNECTION_STRING)
          command: {{- toYaml .Values.image.entrypoint | nindent 12 }}
          args: # some args ...
          ports:
            - name: api
              containerPort: 8081
          resources:
            limits:
              cpu: 1
              memory: 1024Mi
            requests:
              cpu: 100m
              memory: 50Mi
          readinessProbe:
            httpGet:
              path: /api/scheduler/healthcheck
              port: api
              scheme: HTTP
            initialDelaySeconds: 10
            periodSeconds: 5
            timeoutSeconds: 1
          livenessProbe:
            tcpSocket:
              port: api
            initialDelaySeconds: 120
            periodSeconds: 10
            timeoutSeconds: 5
          volumeMounts:
            - name: {{ include "my-project.fullname" . }}-volume
              mountPath: /etc/test/scheduler.yaml
              subPath: scheduler.yaml
              readOnly: true
      volumes:
      - name: {{ include "my-project.fullname" . }}-volume
        configMap:
          name: {{ include "my-project.fullname" . }}-config
      restartPolicy: Never

The service (for the subdomain):

apiVersion: v1
kind: Service
metadata:
  name: {{ include "my-project.fullname" . }}-listener-dmn
spec:
  selector:
    name: {{ include "my-project.fullname" . }}-listener
  ports:
    - name: api
      port: 8081
      targetPort: 8081
  type: ClusterIP

Role and RoleBinding (to enable access for the curl command):

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: {{ include "my-project.fullname" . }}-role
rules:
- apiGroups: [""] # "" indicates the core API group
  resources: ["pods"]
  verbs: ["get", "watch", "list", "update"]
- apiGroups: [""] # "" indicates the core API group
  resources: ["pods/exec"]
  verbs: ["create", "delete", "deletecollection", "get", "list", "patch", "update", "watch"]
- apiGroups: ["", "app", "batch"] # "" indicates the core API group
  resources: ["jobs"]
  verbs: ["get", "watch", "list"]

Role-Binding:

apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: {{ include "go-scheduler.fullname" . }}-rolebinding
subjects:
- kind: ServiceAccount
  name: default
roleRef:
  kind: Role
  name: {{ include "go-scheduler.fullname" . }}-role
  apiGroup: rbac.authorization.k8s.io

And finally a tester that does a curl command:

(For checking, I set the command to tail -f, so I can exec into the Pod.)

apiVersion: batch/v1
kind: Job
metadata:
  name: {{ include "my-project.fullname" . }}-test
  namespace: {{ .Release.Namespace }}
  labels:
    name: {{ include "my-project.fullname" . }}-test
    app: {{ include "my-project.fullname" . }}-test 
  annotations:
    "prometheus.io/scrape": {{ .Values.prometheus.scrape | quote }}
    "prometheus.io/path": {{ .Values.prometheus.path }}
    "prometheus.io/port": {{ .Values.ports.api.container | quote }}
spec:
  template: #PodTemplateSpec (Core/V1)
    spec: #PodSpec (core/v1)
      initContainers:
        # appears twice - could be factored out into helpers.tpl
        #
        - name: wait-sched-listener-exists
          image: {{ .Values.global.registry }}/{{ .Values.global.k8s.image }}:{{ .Values.global.k8s.tag | default "latest" }}
          imagePullPolicy: IfNotPresent
          env:
            - name: POD_NAME
              value: {{ include "my-project.fullname" . }}-listener
          command:
            - /bin/sh
          args:
            - -c
            - |-
              while [ "$(kubectl get job $POD_NAME 2>/dev/null | grep $POD_NAME | awk '{print $1;}')" \!= "$POD_NAME" ];do
                echo 'Waiting for scheduler pod to exist ...';
                sleep 5;
              done
        - name: wait-listener-running
          image: {{ .Values.global.registry }}/{{ .Values.global.k8s.image }}:{{ .Values.global.k8s.tag | default "latest" }}
          imagePullPolicy: IfNotPresent
          env:
            - name: POD_NAME
              value: {{ include "my-project.fullname" . }}-listener
          command:
            - /bin/sh
          args:
            - -c
            - |-
              while [ "$(kubectl get pods 2>/dev/null | grep $POD_NAME | awk '{print $3;}')" \!= "Running" ];do
                echo 'Waiting for scheduler pod to run ...';
                sleep 5;
              done
      containers:
        - name: {{ include "my-project.fullname" . }}-test
          image:  {{ .Values.global.registry }}/{{ .Values.global.k8s.image }}:{{ .Values.global.k8s.tag | default "latest" }}
          imagePullPolicy: {{ .Values.image.pullPolicy }}
          command:
            - /bin/sh
          args:
            - -c
            - "tail -f"
            # instead of the above, this could be the curl command: "curl -H 'Accept: application/json' -X get my-project-listener.my-project-listener-dmn:8081/api/scheduler/jobs"

      restartPolicy: Never

I enter the test Pod:

kubectl exec -it my-tester-<hash> -- /bin/sh

... and run the command:

ping my-project-listener.my-project-listener-dmn

Got:

ping: bad address 'my-project-listener.my-project-listener-dmn'

When doing that for a plain Pod, I get:

PING pod-hostname.pod-subdomain (): ... data bytes

CodePudding user response:

There's a lot here, but I think you should be able to resolve all of this with a couple of small changes.

In summary, I'd suggest changing:

apiVersion: apps/v1
kind: Deployment     # <-- not a Job
metadata: &original-job-metadata-from-the-question
spec:
  selector:
    matchLabels:   # required for apps/v1; must match the Pod labels below
      name: {{ include "my-project.fullname" . }}-listener
  template:
    metadata:
      labels:   # vvv matching the Service selector (and the selector above)
        name: {{ include "my-project.fullname" . }}-listener
    spec:
      # delete all of the initContainers:
      containers: &original-container-list-from-the-question
      volumes: &original-volume-list-from-the-question
      # delete restartPolicy: (default value Always)

Delete the Role and RoleBinding objects; connect to the Service http://my-project-listener-dmn:8081 and not an individual Pod; and you can kubectl wait --for=condition=available on the Deployment.

Connect to Services, not individual Pods (or Jobs or Deployments). The Service is named {{ include "my-project.fullname" . }}-listener-dmn and that is the host name you should connect to. The Service acts as a very lightweight in-cluster load balancer, and will forward requests on to one of the pods identified by its selector.

So in this example you'd connect to the Service's name and port, http://my-project-listener-dmn:8081. Your application doesn't answer the very-low-level ICMP protocol and I'd avoid ping(1) in favor of a more useful diagnostic. Also consider setting the Service's port to the default HTTP port 80; it doesn't necessarily need to match the Pod's port.
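
For example, a minimal sketch of that Service (same names as your chart; the port 80 / named targetPort split is the suggestion above, not something taken from your values):

apiVersion: v1
kind: Service
metadata:
  name: {{ include "my-project.fullname" . }}-listener-dmn
spec:
  selector:
    name: {{ include "my-project.fullname" . }}-listener   # must match the Pod labels, see below
  ports:
    - name: api
      port: 80          # what clients connect to
      targetPort: api   # the containerPort named "api" (8081) in the Pod
  type: ClusterIP

The tester would then run something like curl http://my-project-listener-dmn/api/scheduler/jobs (or keep both ports at 8081 and include :8081 in the URL).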

The Service selector needs to match the Pod labels (and not the Job's or Deployment's labels). A Service attaches to Pods; a Job or a Deployment has a template to create Pods; and it's those labels that need to match up. You need to add labels to the Pod template:

spec:
  template:
    metadata:
      labels:
        name: {{ include "my-project.fullname" . }}-listener

Or, in a Helm chart where you have a helper to generate these labels,

      labels: {{- include "my-project.labels" . | nindent 8 }}
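
If your chart doesn't already define such a helper, a minimal sketch of one in _helpers.tpl might be (the helper name and label keys here are assumptions; whatever it emits is what the Service's selector has to match):

{{- define "my-project.labels" -}}
app.kubernetes.io/name: {{ .Chart.Name }}
app.kubernetes.io/instance: {{ .Release.Name }}
{{- end }}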

The thing to check here is kubectl describe service my-project-listener-dmn. There should be a line at the bottom that says Endpoints: with some IP addresses (technically some individual Pod IP addresses, but you don't usually need to know that). If it says Endpoints: <none> that's usually a sign that the labels don't match up.
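
A quick way to check the same thing from the command line (using the Service name from your chart once rendered):

kubectl describe service my-project-listener-dmn | grep Endpoints
kubectl get endpoints my-project-listener-dmn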

You probably want some level of automatic restarts. A Pod can fail for lots of reasons, including code bugs and network hiccups. If you set restartPolicy: Never then you'll have a Failed Pod, and requests to the Service will fail until you take manual intervention of some sort. I'd suggest setting this to at least restartPolicy: OnFailure, or (for a Deployment) leaving it at its default value of Always. (There is more discussion on Job restart policies in the Kubernetes documentation.)
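
A rough sketch of that change on the Job side (the backoffLimit shown is just the default, included for context):

spec:
  backoffLimit: 6              # how many times the Job retries before giving up (default)
  template:
    spec:
      restartPolicy: OnFailure # restart the container instead of leaving a Failed Pod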

You probably want a Deployment here. A Job is meant for a case where you do some set of batch processing and then the job completes; that's part of why kubectl wait doesn't have the lifecycle option you're looking for.
I'm guessing you want a Deployment instead. With what you've shown here I don't think you need to make any changes at all besides

apiVersion: apps/v1
kind: Deployment

Everything so far about Services and DNS and labels still applies.

You can kubectl wait for a Deployment to be available. Since a Job is expected to run to completion and exit, that's the state kubectl wait allows. A Deployment is "available" if there is at least a minimum number of managed Pods running that pass their health checks, which I think is the state you're after.

kubectl wait --for=condition=available deployment/my-project-listener
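
If you're scripting this, kubectl rollout status deployment/my-project-listener is an equivalent way to block until the Deployment's Pods are up and passing their probes.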

There are simpler ways to check for database liveness. A huge fraction of what you show here is an involved sequence with special permissions to see if the database is running before the pod starts up.

What happens if the database fails while the pod is running? One common thing that will happen is you'll get a cascading sequence of exceptions and your pod will crash. Then with restartPolicy: Always Kubernetes will try to restart it; but if the database still isn't available, it will crash again; and you'll get to a CrashLoopBackOff state. If the database does become available again then eventually Kubernetes will try to restart the Pod and it will succeed.

This same logic can apply at startup time. If the Pod tries to start up, and the database isn't ready yet, and it crashes, Kubernetes will by default restart it, adding some delays after the first couple of attempts. If the database starts up within 30 seconds or so then the application will be up within a minute or so. The restart count will be greater than 0, but kubectl logs --previous will hopefully have a clear exception.

This will let you delete about half of what you show here. Delete all of the initContainers: block; then, since you're not doing any Kubernetes API operations, delete the Role and RoleBinding objects too.

If you really do want to force the Pod to wait for the database and treat startup as a special case, I'd suggest a simpler shell script using the mysql client tool, or even the wait-for script that makes basic TCP calls (the mechanism described in Docker Compose wait for container X before starting Y). This still lets you avoid all of the Kubernetes RBAC setup.
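
For example, a rough sketch of that simpler wait as a single initContainer, assuming there is a Service named {{ .Release.Name }}-mysql in front of the database and that the k8s.db.env helper provides MYSQL_ROOT_PASSWORD and MYSQL_DATABASE as in your existing script:

      initContainers:
        - name: wait-mysql
          image: mysql:8   # assumption: any image that ships the mysql client works
          imagePullPolicy: IfNotPresent
          env:
            {{- include "k8s.db.env" . | nindent 12 }}
          command:
            - /bin/sh
          args:
            - -c
            - |-
              until mysql -h {{ .Release.Name }}-mysql -uroot -p"$MYSQL_ROOT_PASSWORD" -e "USE $MYSQL_DATABASE" >/dev/null 2>&1; do
                echo 'Waiting for the database...';
                sleep 5;
              done

No kubectl is involved, so no Role or RoleBinding is needed for it.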
