Is there a way to enable shareProcessNamespace for helm post-install hook?

I'm running a pod with 3 containers (telegraf, fluentd and an in-house agent) that makes use of shareProcessNamespace: true.

I've written a Python script to fetch the initial config for telegraf and fluentd from a central controller API endpoint. Since this is a one-time operation, I plan to use a Helm post-install hook.

apiVersion: batch/v1
kind: Job
metadata:
  name: agent-postinstall
  annotations:
    "helm.sh/hook-weight": "3"
    "helm.sh/hook": "post-install"
spec:
  template:
    spec:
      containers:
      - name: agent-postinstall
        image: "{{ .Values.image.agent.repository }}:{{ .Values.image.agent.tag | default .Chart.AppVersion }}"
        imagePullPolicy: IfNotPresent
        command: ['python3', 'getBaseCfg.py']
        volumeMounts:
          - name: config-agent-volume
            mountPath: /etc/config
      volumes:
        - name: config-agent-volume
          configMap:
            name: agent-cm
      restartPolicy: Never
  backoffLimit: 1

The Python script needs to check that the telegraf/fluentd/agent processes are up before getting the config. I intend to wait (with a timeout) until pgrep <telegraf/fluentd/agent> succeeds and then call the APIs. Is there a way to enable shareProcessNamespace for the post-install hook as well? Thanks.

PS: Currently, the agent calls the python script along with its own startup script. It works, but it is kludgy. I'd like to move it out of agent container.

Answer:

shareProcessNamespace

The most important point about this flag is that it works only within a single pod: all containers in that pod share one process namespace and can see each other's processes.

The described approach uses a Job. A Job creates a separate pod, so it won't work this way. The container has to be part of the "main" pod, alongside all the other containers, to have access to the running processes of that pod.
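
To illustrate why the check has to run inside the main pod, here is a minimal Python sketch of what a container in a shared process namespace can see. It assumes a Linux container in a pod with shareProcessNamespace: true, where /proc lists the processes of every container in the pod. The function names are hypothetical and not part of the asker's script. It compares the short process name from /proc/<pid>/comm, which is also what pgrep matches by default.

```python
import os


def running_process_names(proc_root="/proc"):
    """Return the set of process names visible under proc_root."""
    names = set()
    for entry in os.listdir(proc_root):
        if not entry.isdigit():  # only numeric directories are PIDs
            continue
        try:
            # comm holds the process name, truncated to 15 characters
            with open(os.path.join(proc_root, entry, "comm")) as f:
                names.add(f.read().strip())
        except OSError:
            # The process exited between listdir() and open(); skip it
            continue
    return names


def missing_processes(required, proc_root="/proc"):
    """Return the required names that are not running, sorted."""
    return sorted(set(required) - running_process_names(proc_root))
```

Called as missing_processes(["telegraf", "fluentd", "agent"]) from a container in the main pod, this returns an empty list once all three are up. Run from the Job's pod, it would only ever see the Job's own processes.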

More details are in the Kubernetes documentation on sharing a process namespace between containers in a pod.

Possible solution

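It's possible to check the processes in the containers directly with the kubectl exec command.

Below is an example of how to check the state of the processes with the pgrep command. The container named in pgrepContainer needs to have pgrep already installed. Because the target pod uses shareProcessNamespace, that one container can see the processes of all the others.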

job.yaml:

apiVersion: batch/v1
kind: Job
metadata:
  name: "{{ .Release.Name }}-postinstall-hook"
  annotations:
    "helm.sh/hook": post-install
spec:
  template:
    spec:
      serviceAccountName: config-user # this approach requires a service account that is allowed to exec into the target pod (get on pods, create on pods/exec)
      volumes:
      - name: check-script
        configMap:
          name: check-script
      restartPolicy: Never
      containers:
      - name: post-install-job
        image: "bitnami/kubectl" # using this image with kubectl so we can connect to the cluster
        command: ["bash", "/mnt/script/checkScript.sh"]
        volumeMounts:
        - name: check-script
          mountPath: /mnt/script

And configmap.yaml, which contains the script. It checks the three processes in a loop, for up to 60 attempts with 10 seconds between them:

apiVersion: v1
kind: ConfigMap
metadata:
  name: check-script
data:
  checkScript.sh: |
    #!/bin/bash
    podName=test
    pgrepContainer=app-1
    process1=sleep
    process2=pause
    process3=postgres
    attempts=0

    # Retry up to 60 times, 10 seconds apart
    until [ "$attempts" -eq 60 ]; do
      # Succeeds only when all three processes are found
      if kubectl exec "${podName}" -c "${pgrepContainer}" -- pgrep "${process1}" >/dev/null 2>&1 \
        && kubectl exec "${podName}" -c "${pgrepContainer}" -- pgrep "${process2}" >/dev/null 2>&1 \
        && kubectl exec "${podName}" -c "${pgrepContainer}" -- pgrep "${process3}" >/dev/null 2>&1; then
        break
      fi

      attempts=$((attempts + 1))
      sleep 10
      echo "Waiting for all containers to be ready...$((attempts * 10)) s"
    done

    if [ "$attempts" -eq 60 ]; then
      echo "ERROR: Timeout"
      exit 1
    fi

    echo "All containers are ready !"
    echo "Configuring telegraf and fluentd services"

The final result will look like this:

$ kubectl get pods
NAME                        READY   STATUS     RESTARTS  AGE
test                        2/2     Running    0         20m
test-postinstall-hook-dgrc9 0/1     Completed  0         20m

$ kubectl logs test-postinstall-hook-dgrc9
Waiting for all containers to be ready...10 s
All containers are ready !
Configuring telegraf and fluentd services

The above is another approach; you can use its logic as a base to achieve your end goal.
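
Since the config-fetching script in the question is written in Python, the same polling logic can be sketched there as well. This is only a rough equivalent of the shell loop, assuming kubectl is on the PATH of the Job's image and the service account may exec into the pod. The function names are hypothetical; the pod and container names (test, app-1) are the example values from the script above.

```python
import subprocess
import time


def kubectl_pgrep(pod, container, process):
    """Return True if `pgrep <process>` succeeds inside the given container."""
    result = subprocess.run(
        ["kubectl", "exec", pod, "-c", container, "--", "pgrep", process],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def wait_for_processes(processes, check, attempts=60, interval=10,
                       sleep=time.sleep):
    """Poll until check(name) is true for every process or attempts run out."""
    for _ in range(attempts):
        if all(check(name) for name in processes):
            return True
        sleep(interval)
    return False


def main():
    # Example values only: pod "test" and container "app-1" as in the script
    # above, process names as in the question.
    def check(name):
        return kubectl_pgrep("test", "app-1", name)

    if not wait_for_processes(["telegraf", "fluentd", "agent"], check):
        raise SystemExit("ERROR: Timeout")
    print("All containers are ready!")
```

Calling main() at the start of the hook script keeps the wait and the API calls in one place. The check and sleep functions are passed in as parameters so the loop can be exercised without a cluster.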

postStart

A postStart hook can also be considered as a place for this logic. It runs right after the container is created. Since the main application takes time to start and the logic already waits for it, the following caveat is not an issue:

there is no guarantee that the hook will execute before the container ENTRYPOINT