I applied the following taint and label to a node, but the pod never reaches a Running status and I cannot figure out why:
kubectl taint node k8s-worker-2 dedicated=devs:NoSchedule
kubectl label node k8s-worker-2 dedicated=devs
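Both can be double-checked with, for example:
kubectl describe node k8s-worker-2 | grep -i taint
kubectl get node k8s-worker-2 --show-labels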
Here is my pod YAML file:
apiVersion: v1
kind: Pod
metadata:
  creationTimestamp: null
  labels:
    security: s1
  name: pod-1
spec:
  containers:
  - image: nginx
    name: bear
    resources: {}
  tolerations:
  - key: "dedicated"
    operator: "Equal"
    value: "devs"
    effect: "NoSchedule"
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: dedicated
            operator: In
            values:
            - devs
  dnsPolicy: ClusterFirst
  restartPolicy: Always
  nodeName: k8s-master-2
status: {}
On creating the pod, it gets scheduled on the k8s-worker-2 node but remains in a Pending state before it's finally evicted. Here are sample outputs:
kubectl describe no k8s-worker-2 | grep -i taint
Taints: dedicated=devs:NoSchedule
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
pod-1 0/1 Pending 0 9s <none> k8s-master-2 <none> <none>
# second check
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
pod-1 0/1 Pending 0 59s <none> k8s-master-2 <none> <none>
Name: pod-1
Namespace: default
Priority: 0
Node: k8s-master-2/
Labels: security=s1
Annotations: <none>
Status: Pending
IP:
IPs: <none>
Containers:
bear:
Image: nginx
Port: <none>
Host Port: <none>
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-dzvml (ro)
Volumes:
kube-api-access-dzvml:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: dedicated=devs:NoSchedule
node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events: <none>
Also, here is the output of kubectl describe node:
root@k8s-master-1:~/scheduling# kubectl describe nodes k8s-worker-2
Name: k8s-worker-2
Roles: <none>
Labels: beta.kubernetes.io/arch=amd64
beta.kubernetes.io/os=linux
dedicated=devs
kubernetes.io/arch=amd64
kubernetes.io/hostname=k8s-worker-2
kubernetes.io/os=linux
Annotations: kubeadm.alpha.kubernetes.io/cri-socket: /var/run/dockershim.sock
node.alpha.kubernetes.io/ttl: 0
projectcalico.org/IPv4Address: 10.128.0.4/32
projectcalico.org/IPv4IPIPTunnelAddr: 192.168.140.0
volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp: Sun, 18 Jul 2021 16:18:41 +0000
Taints: dedicated=devs:NoSchedule
Unschedulable: false
Lease:
HolderIdentity: k8s-worker-2
AcquireTime: <unset>
RenewTime: Sun, 10 Oct 2021 18:54:46 +0000
Conditions:
Type Status LastHeartbeatTime LastTransitionTime Reason Message
---- ------ ----------------- ------------------ ------ -------
NetworkUnavailable False Sun, 10 Oct 2021 18:48:50 +0000 Sun, 10 Oct 2021 18:48:50 +0000 CalicoIsUp Calico is running on this node
MemoryPressure False Sun, 10 Oct 2021 18:53:40 +0000 Mon, 04 Oct 2021 07:52:58 +0000 KubeletHasSufficientMemory kubelet has sufficient memory available
DiskPressure False Sun, 10 Oct 2021 18:53:40 +0000 Mon, 04 Oct 2021 07:52:58 +0000 KubeletHasNoDiskPressure kubelet has no disk pressure
PIDPressure False Sun, 10 Oct 2021 18:53:40 +0000 Mon, 04 Oct 2021 07:52:58 +0000 KubeletHasSufficientPID kubelet has sufficient PID available
Ready True Sun, 10 Oct 2021 18:53:40 +0000 Mon, 04 Oct 2021 07:52:58 +0000 KubeletReady kubelet is posting ready status. AppArmor enabled
Addresses:
InternalIP: 10.128.0.4
Hostname: k8s-worker-2
Capacity:
cpu: 2
ephemeral-storage: 20145724Ki
hugepages-1Gi: 0
hugepages-2Mi: 0
memory: 8149492Ki
pods: 110
Allocatable:
cpu: 2
ephemeral-storage: 18566299208
hugepages-1Gi: 0
hugepages-2Mi: 0
memory: 8047092Ki
pods: 110
System Info:
Machine ID: 3c2709a436fa0c630680bac68ad28669
System UUID: 3c2709a4-36fa-0c63-0680-bac68ad28669
Boot ID: 18a3541f-f3b4-4345-ba45-8cfef9fb1364
Kernel Version: 5.8.0-1038-gcp
OS Image: Ubuntu 20.04.2 LTS
Operating System: linux
Architecture: amd64
Container Runtime Version: docker://20.10.7
Kubelet Version: v1.21.3
Kube-Proxy Version: v1.21.3
PodCIDR: 192.168.2.0/24
PodCIDRs: 192.168.2.0/24
Non-terminated Pods: (2 in total)
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits Age
--------- ---- ------------ ---------- --------------- ------------- ---
kube-system calico-node-gp4tk 250m (12%) 0 (0%) 0 (0%) 0 (0%) 84d
kube-system kube-proxy-6xxgx 0 (0%) 0 (0%) 0 (0%) 0 (0%) 81d
Allocated resources:
(Total limits may be over 100 percent, i.e., overcommitted.)
Resource Requests Limits
-------- -------- ------
cpu 250m (12%) 0 (0%)
memory 0 (0%) 0 (0%)
ephemeral-storage 0 (0%) 0 (0%)
hugepages-1Gi 0 (0%) 0 (0%)
hugepages-2Mi 0 (0%) 0 (0%)
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Starting 6m25s kubelet Starting kubelet.
Normal NodeAllocatableEnforced 6m25s kubelet Updated Node Allocatable limit across pods
Normal NodeHasSufficientMemory 6m19s (x7 over 6m25s) kubelet Node k8s-worker-2 status is now: NodeHasSufficientMemory
Normal NodeHasNoDiskPressure 6m19s (x7 over 6m25s) kubelet Node k8s-worker-2 status is now: NodeHasNoDiskPressure
Normal NodeHasSufficientPID 6m19s (x7 over 6m25s) kubelet Node k8s-worker-2 status is now: NodeHasSufficientPID
Warning Rebooted 6m9s kubelet Node k8s-worker-2 has been rebooted, boot id: 18a3541f-f3b4-4345-ba45-8cfef9fb1364
Normal Starting 6m7s kube-proxy Starting kube-proxy.
I've included the following to show that the pod never emits any events and that it is eventually removed on its own.
root@k8s-master-1:~/format/scheduling# kubectl get po
No resources found in default namespace.
root@k8s-master-1:~/format/scheduling# kubectl create -f nginx.yaml
pod/pod-1 created
root@k8s-master-1:~/format/scheduling# kubectl get po pod-1
NAME READY STATUS RESTARTS AGE
pod-1 0/1 Pending 0 10s
root@k8s-master-1:~/format/scheduling# kubectl describe po pod-1
Name: pod-1
Namespace: default
Priority: 0
Node: k8s-master-2/
Labels: security=s1
Annotations: <none>
Status: Pending
IP:
IPs: <none>
Containers:
bear:
Image: nginx
Port: <none>
Host Port: <none>
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-5hsq4 (ro)
Volumes:
kube-api-access-5hsq4:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: dedicated=devs:NoSchedule
node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events: <none>
root@k8s-master-1:~/format/scheduling# kubectl get po pod-1
NAME READY STATUS RESTARTS AGE
pod-1 0/1 Pending 0 45s
root@k8s-master-1:~/format/scheduling# kubectl get po pod-1
NAME READY STATUS RESTARTS AGE
pod-1 0/1 Pending 0 62s
root@k8s-master-1:~/format/scheduling# kubectl get po pod-1
NAME READY STATUS RESTARTS AGE
pod-1 0/1 Pending 0 74s
root@k8s-master-1:~/format/scheduling# kubectl get po pod-1
Error from server (NotFound): pods "pod-1" not found
root@k8s-master-1:~/format/scheduling# kubectl get po
No resources found in default namespace.
root@k8s-master-1:~/format/scheduling#
CodePudding user response:
I was able to figure this one out later. On reproducing the same case on another cluster, the pod was created on the node that had the scheduling parameters set. Then it occurred to me that the only change I had made to the manifest was setting nodeName: node-1
to match the right node on the other cluster.
In the original manifest I was literally assigning the pod to a control plane node with nodeName: k8s-master-2,
and this was causing the conflict.
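For anyone hitting the same issue, here is a minimal sketch of what the corrected manifest would look like, assuming the fix is simply to drop the hard-coded nodeName (or point it at the intended worker) and let the scheduler place the pod through the toleration and node affinity:
apiVersion: v1
kind: Pod
metadata:
  labels:
    security: s1
  name: pod-1
spec:
  containers:
  - image: nginx
    name: bear
    resources: {}
  tolerations:
  - key: "dedicated"
    operator: "Equal"
    value: "devs"
    effect: "NoSchedule"
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: dedicated
            operator: In
            values:
            - devs
  restartPolicy: Always
  # no nodeName here: the scheduler should now pick k8s-worker-2 via the toleration and affinity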
CodePudding user response:
On creating the pod, it gets scheduled on the k8s-worker-2 node but remains in a Pending state before it's finally evicted.
Also make sure the node has enough free resources; a pod can be evicted because of resource pressure as well.
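For example, the node's allocatable capacity and what is already requested can be checked with something like this (kubectl top needs the metrics-server add-on to be installed):
kubectl describe node k8s-worker-2 | grep -A 8 "Allocated resources"
kubectl top node k8s-worker-2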