Hey, I have a Kubernetes cluster for a GitLab CI/CD pipeline. There is a GitLab Runner (Kubernetes executor) running on it.
Sometimes the pipeline passes, but sometimes I get:
Waiting for pod gitlab-runner/runner-wyplq6-h-project-7180-concurrent-0lr66z to be running, status is Pending
ContainersNotInitialized: "containers with incomplete status: [init-permissions]"
ContainersNotReady: "containers with unready status: [build helper]"
ERROR: Job failed (system failure): prepare environment: waiting for pod running: timed out waiting for pod to start. Check https://docs.gitlab.com/runner/shells/index.html#shell-profile-loading for more information
I checked the link, but it says the Kubernetes executor should not cause any problems with shell profiles.
So I ran kubectl describe pod gitlab-runner/runner-wyplq6-h-project-7180-concurrent-0lr66z
...
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 40s default-scheduler Successfully assigned gitlab-runner/runner-wyplq6-h-project-7180-concurrent-0lr66z to bloxberg
Warning FailedCreatePodContainer 5s kubelet unable to ensure pod container exists: failed to create container for [kubepods besteffort pod6fe2669a-ae7f-47e3-8794-814767c14895] : Failed to activate service 'org.freedesktop.systemd1': timed out (service_start_timeout=25000ms)
Why does the runner fail to start systemd, and is there a way to fix that?
CodePudding user response:
According to the troubleshooting docs for the Kubernetes executor, you should increase the poll_timeout value in your config.toml:
The following errors are commonly encountered when using the Kubernetes executor.
Job failed (system failure): timed out waiting for pod to start
If the cluster cannot schedule the build pod before the timeout defined by poll_timeout, the build pod returns an error. The Kubernetes Scheduler should be able to delete it. To fix this issue, increase the poll_timeout value in your config.toml file.
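For reference, here is a minimal config.toml sketch. The runner name, URL, and the 600-second value are placeholders for illustration; the documented defaults are 180 seconds for poll_timeout and 3 seconds for poll_interval:

```toml
concurrent = 1

[[runners]]
  name = "example-runner"               # placeholder
  url = "https://gitlab.example.com/"   # placeholder
  executor = "kubernetes"
  [runners.kubernetes]
    namespace = "gitlab-runner"
    # How long (in seconds) the runner waits for the build pod to
    # reach Running before failing the job. Default: 180.
    poll_timeout = 600
    # How often (in seconds) the runner polls the pod status. Default: 3.
    poll_interval = 5
```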
CodePudding user response:
Job failed (system failure): timed out waiting for pod to start
This error occurs when the cluster cannot schedule the build pod before the timeout defined by poll_timeout; the build pod then returns an error. The Kubernetes Scheduler should be able to delete it.
To fix this issue, increase the poll_timeout value in your config.toml file.
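If your runner is installed directly on a host (not via the Helm chart, where the config is templated from values.yaml instead), one way to apply this is to edit the file in place. The sketch below works on a local copy so it is safe to try; the real file usually lives at /etc/gitlab-runner/config.toml, and 600 is just an example value:

```shell
# Sketch: raise poll_timeout in a local copy of config.toml.
# Adjust the path to your actual runner config before using for real.
cat > config.toml <<'EOF'
[[runners]]
  executor = "kubernetes"
  [runners.kubernetes]
    poll_timeout = 180
EOF

# Bump the timeout from the 180 s default to 600 s (GNU sed).
sed -i 's/poll_timeout = 180/poll_timeout = 600/' config.toml
grep poll_timeout config.toml
```

GitLab Runner watches config.toml and reloads it on change; if yours does not pick it up, restart the service (e.g. systemctl restart gitlab-runner).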
Error cleaning up pod and Job failed (system failure): prepare environment: waiting for pod running
These errors occur when Kubernetes fails to schedule the job pod in a timely manner. GitLab Runner waits for the pod to be ready, but it fails and then tries to clean up the pod, which can also fail.
To troubleshoot, check the Kubernetes primary node and all nodes that run a kube-apiserver instance. Ensure they have all of the resources needed to manage the target number of pods that you hope to scale up to on the cluster.
To change the time GitLab Runner waits for a pod to reach its Ready status, use the poll_timeout setting.
To better understand how pods are scheduled or why they might not get scheduled on time, read about the Kubernetes Scheduler.
Refer to the troubleshooting documentation for more details.