Hey, I have a Kubernetes cluster for a GitLab CI/CD pipeline. There is a GitLab Runner (Kubernetes executor) running on it.
Sometimes the pipeline passes, but sometimes I get:
Waiting for pod gitlab-runner/runner-wyplq6-h-project-7180-concurrent-0lr66z to be running, status is Pending
ContainersNotInitialized: "containers with incomplete status: [init-permissions]"
ContainersNotReady: "containers with unready status: [build helper]"
ERROR: Job failed (system failure): prepare environment: waiting for pod running: timed out waiting for pod to start. Check https://docs.gitlab.com/runner/shells/index.html#shell-profile-loading for more information
I checked the link, but it says the Kubernetes executor should not cause any problems with shell profiles.
So I ran kubectl describe pod gitlab-runner/runner-wyplq6-h-project-7180-concurrent-0lr66z
...
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 40s default-scheduler Successfully assigned gitlab-runner/runner-wyplq6-h-project-7180-concurrent-0lr66z to bloxberg
Warning FailedCreatePodContainer 5s kubelet unable to ensure pod container exists: failed to create container for [kubepods besteffort pod6fe2669a-ae7f-47e3-8794-814767c14895] : Failed to activate service 'org.freedesktop.systemd1': timed out (service_start_timeout=25000ms)
Why does the runner fail to start systemd, and is there a way to fix that?
CodePudding user response:
According to the troubleshooting docs for the Kubernetes executor, you should increase the poll_timeout value in your config.toml:
The following errors are commonly encountered when using the Kubernetes executor.
Job failed (system failure): timed out waiting for pod to start
If the cluster cannot schedule the build pod before the timeout defined by poll_timeout, the build pod returns an error. The Kubernetes Scheduler should be able to delete it. To fix this issue, increase the poll_timeout value in your config.toml file.
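For reference, here is a minimal config.toml sketch. The runner name, URL, and the 600-second value are placeholders for illustration; the documented defaults are 180 seconds for poll_timeout and 3 seconds for poll_interval:

```toml
concurrent = 1

[[runners]]
  name = "example-runner"               # placeholder
  url = "https://gitlab.example.com/"   # placeholder
  executor = "kubernetes"
  [runners.kubernetes]
    namespace = "gitlab-runner"
    # How long (in seconds) the runner waits for the build pod to
    # reach Running before failing the job. Default: 180.
    poll_timeout = 600
    # How often (in seconds) the runner polls the pod status. Default: 3.
    poll_interval = 5
```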
CodePudding user response:
Job failed (system failure): timed out waiting for pod to start
This error occurs when the cluster cannot schedule the build pod before the timeout defined by poll_timeout; the build pod then returns an error. The Kubernetes Scheduler should be able to delete it.
To fix this issue, increase the poll_timeout value in your config.toml file.
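If your runner is installed directly on a host (not via the Helm chart, where the config is templated from values.yaml instead), one way to apply this is to edit the file in place. The sketch below works on a local copy so it is safe to try; the real file usually lives at /etc/gitlab-runner/config.toml, and 600 is just an example value:

```shell
# Sketch: raise poll_timeout in a local copy of config.toml.
# Adjust the path to your actual runner config before using for real.
cat > config.toml <<'EOF'
[[runners]]
  executor = "kubernetes"
  [runners.kubernetes]
    poll_timeout = 180
EOF

# Bump the timeout from the 180 s default to 600 s (GNU sed).
sed -i 's/poll_timeout = 180/poll_timeout = 600/' config.toml
grep poll_timeout config.toml
```

GitLab Runner watches config.toml and reloads it on change; if yours does not pick it up, restart the service (e.g. systemctl restart gitlab-runner).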
Error cleaning up pod and Job failed (system failure): prepare environment: waiting for pod running
These errors occur when Kubernetes fails to schedule the job pod in a timely manner. GitLab Runner waits for the pod to be ready, but it fails and then tries to clean up the pod, which can also fail.
To troubleshoot, check the Kubernetes primary node and all nodes that run a kube-apiserver instance. Ensure they have all of the resources needed to manage the target number of pods that you hope to scale up to on the cluster.
To change the time GitLab Runner waits for a pod to reach its Ready status, use the poll_timeout setting.
To better understand how pods are scheduled or why they might not get scheduled on time, read about the Kubernetes Scheduler.
Refer to the troubleshooting documentation for more details.