AWS EKS pod runs into CrashLoopBackOff status

Time:09-14

I'm new to EKS and am doing a POC whose aim is to deploy an image to an EKS cluster. The image contains a Python script, which I'm hoping to trigger from some event and run in an EKS pod. I have managed to set up an EKS cluster, but all my pods end up in CrashLoopBackOff status.

Here is my image's Dockerfile; the image is published to AWS ECR.

FROM python:3.9-buster

COPY requirements.txt .
RUN pip install -r requirements.txt

COPY job.py /opt/app/jobs/

# Install aws
RUN curl "https://d1vvhvl2y92vvt.cloudfront.net/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip"
RUN unzip awscliv2.zip
RUN ./aws/install

CMD ["bash"]

My deployment.yaml is configured as:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: python-job-deployment
  namespace: python-job-namespace
  labels:
    app: python-job-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: python-job-app
  template:
    metadata:
      labels:
        app: python-job-app
    spec:
      containers:
      - name: python-job
        image: xxxx.dkr.ecr.$region.amazonaws.com/python-job:latest
        ports:
        - containerPort: 80

and my service.yaml:

apiVersion: v1
kind: Service
metadata:
  name: python-job-service
  namespace: python-job-namespace
  labels:
    app: python-job-app
spec:
  selector:
    app: python-job-app
  ports:
    - protocol: TCP
      port: 80
      targetPort: 80

When I run kubectl get all -n python-job-namespace, I get the following output:

NAME                                         READY   STATUS             RESTARTS       AGE
pod/python-job-deployment-5998c7b884-7zwff   0/1     CrashLoopBackOff   10 (17s ago)   26m
pod/python-job-deployment-5998c7b884-b6gkp   0/1     CrashLoopBackOff   10 (19s ago)   26m
pod/python-job-deployment-5998c7b884-w5tx9   0/1     CrashLoopBackOff   10 (27s ago)   26m

NAME                         TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)   AGE
service/python-job-service   ClusterIP   x.x.x.x         <none>        80/TCP    5h21m

NAME                                    READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/python-job-deployment   0/3     3            0           26m

NAME                                               DESIRED   CURRENT   READY   AGE
replicaset.apps/python-job-deployment-5998c7b884   3         3         0       26m

Running kubectl describe pod $podname shows:

Name:             python-job-deployment-5998c7b884-7zwff
Namespace:        python-job-namespace
Priority:         0
Service Account:  default
Node:             ip-x-x-x-x.$region.compute.internal/x.x.x.x
Start Time:       Tue, 13 Sep 2022 15:36:41 +1000
Labels:           app=python-job-app
                  pod-template-hash=5998c7b884
Annotations:      kubernetes.io/psp: eks.privileged
Status:           Running
IP:               x.x.x.x
IPs:
  IP:           x.x.x.x
Controlled By:  ReplicaSet/python-job-deployment-5998c7b884
Containers:
  raw2std:
    Container ID:   docker://12b686344412e95968780d4bfd18ce357714f14a53b892a16bff92c49c4bef3a
    Image:          xxxxxxx.dkr.ecr.$region.amazonaws.com/python-job:latest
    Image ID:       docker-pullable://xxxxx.dkr.ecr.$region.amazonaws.com/python-job@sha256:58283c4eee275e13b8f8c531b32ed7027a667a0775f32db458c69b1b5531e5d2
    Port:           80/TCP
    Host Port:      0/TCP
    State:          Waiting
      Reason:       CrashLoopBackOff
    Last State:     Terminated
      Reason:       Completed
      Exit Code:    0
      Started:      Tue, 13 Sep 2022 16:03:19 +1000
      Finished:     Tue, 13 Sep 2022 16:03:19 +1000
    Ready:          False
    Restart Count:  10
    Environment:    <none>
    Mounts:
      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-bcqp4 (ro)
Conditions:
  Type              Status
  Initialized       True
  Ready             False
  ContainersReady   False
  PodScheduled      True
Volumes:
  kube-api-access-bcqp4:
    Type:                    Projected (a volume that contains injected data from multiple sources)
    TokenExpirationSeconds:  3607
    ConfigMapName:           kube-root-ca.crt
    ConfigMapOptional:       <nil>
    DownwardAPI:             true
QoS Class:                   BestEffort
Node-Selectors:              <none>
Tolerations:                 node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
  Type     Reason     Age                    From               Message
  ----     ------     ----                   ----               -------
  Normal   Scheduled  29m                    default-scheduler  Successfully assigned python-job-namespace/python-job-deployment-5998c7b884-7zwff to ip-x-x-x-x.$region.compute.internal
  Normal   Pulled     29m                    kubelet            Successfully pulled image "xxxxxxxx.dkr.ecr.$region.amazonaws.com/python-job:latest" in 89.681294ms
  Normal   Pulled     29m                    kubelet            Successfully pulled image "xxxxxx.dkr.ecr.$region.amazonaws.com/python-job:latest" in 94.853272ms
  Normal   Pulled     29m                    kubelet            Successfully pulled image "xxxxxx.dkr.ecr.$region.amazonaws.com/python-job:latest" in 92.416417ms
  Normal   Created    29m (x4 over 29m)      kubelet            Created container raw2std
  Normal   Started    29m (x4 over 29m)      kubelet            Started container raw2std
  Normal   Pulled     29m                    kubelet            Successfully pulled image "xxxxx.dkr.ecr.$region.amazonaws.com/python-job:latest" in 111.313488ms
  Normal   Pulling    28m (x5 over 29m)      kubelet            Pulling image "xxxxx.dkr.ecr.$region.amazonaws.com/python-job:latest"
  Warning  BackOff    4m54s (x116 over 29m)  kubelet            Back-off restarting failed container

I have also tried to retrieve the pod's logs, but the command produced no output: kubectl logs -n python-job-namespace $podname

CodePudding user response:

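The issue is most likely your Docker image. The container won't keep running because its CMD is just bash: with no terminal attached, bash exits immediately with code 0, which matches the Completed / Exit Code 0 in your describe output and the empty logs. Kubernetes then restarts the container over and over, hence CrashLoopBackOff. Replace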

CMD ["bash"]

with

ENTRYPOINT ["tail", "-f", "/dev/null"]

and try again. This only keeps the container running indefinitely. (It's better to run your program in CMD.)
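
As a rough sketch of that last point, assuming job.py is meant to sit and wait for events, the script itself could be the container's long-running main process. Everything below is hypothetical: poll_for_event, handle_event and the 5-second interval are placeholders, not part of the original job.py. The only point is that the process does not exit until Kubernetes sends SIGTERM.

```python
import signal
import threading

# Set when the container is asked to stop (SIGTERM from Kubernetes, or Ctrl+C).
stop_requested = threading.Event()


def _request_stop(signum, frame):
    """Signal handler: ask the worker loop to finish."""
    stop_requested.set()


def poll_for_event():
    """Hypothetical placeholder: return an event to process, or None if idle."""
    return None


def handle_event(event):
    """Hypothetical placeholder for the real job logic."""
    # flush=True so the line shows up in `kubectl logs` right away.
    print(f"processing {event!r}", flush=True)


def run_worker(poll=poll_for_event, handle=handle_event, interval=5.0):
    """Loop until a stop is requested, so the main process stays alive.

    Returns the number of events handled.
    """
    handled = 0
    while not stop_requested.is_set():
        event = poll()
        if event is not None:
            handle(event)
            handled += 1
        else:
            # Idle: sleep, but wake up immediately if a stop is requested.
            stop_requested.wait(interval)
    return handled


def main():
    """Entry point: install signal handlers, then block in the worker loop."""
    signal.signal(signal.SIGTERM, _request_stop)
    signal.signal(signal.SIGINT, _request_stop)
    print("worker started", flush=True)
    run_worker()
    print("worker stopped", flush=True)
```

job.py would call main() at the bottom, and the Dockerfile would then end with CMD ["python", "/opt/app/jobs/job.py"] instead of CMD ["bash"]. If the script is supposed to run once and exit, a Kubernetes Job is probably a better fit than a Deployment, since a Deployment always restarts containers that exit.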
