GitlabRunner - tcp://localhost:2375. Is the docker daemon running?

Time:11-16

I'm trying to install a gitlab-runner on EC2. The executor that I want is Docker.

My config.toml is

concurrent = 10
check_interval = 0

[session_server]
  session_timeout = 1800

[[runners]]
  name = "My Docker Runner"
  url = "https://gitlab.com/"
  token = "SECRET"
  executor = "docker"
  [runners.custom_build_dir]
  [runners.cache]
    [runners.cache.s3]
    [runners.cache.gcs]
    [runners.cache.azure]
  [runners.docker]
    tls_verify = false
    image = "docker:19.03.12"
    privileged = true
    disable_entrypoint_overwrite = false
    oom_kill_disable = false
    disable_cache = false
    volumes = ["/certs/client", "/cache"]
    shm_size = 0

My .gitlab-ci.yml is

image: docker:19.03.12

variables:
  # When you use the dind service, you must instruct Docker to talk with
  # the daemon started inside of the service. The daemon is available
  # with a network connection instead of the default
  # /var/run/docker.sock socket. Docker 19.03 does this automatically
  # by setting the DOCKER_HOST in
  # https://github.com/docker-library/docker/blob/d45051476babc297257df490d22cbd806f1b11e4/19.03/docker-entrypoint.sh#L23-L29
  #
  # The 'docker' hostname is the alias of the service container as described at
  # https://docs.gitlab.com/ee/ci/docker/using_docker_images.html#accessing-the-services.
  #
  # Specify to Docker where to create the certificates. Docker
  # creates them automatically on boot, and creates
  # `/certs/client` to share between the service and job
  # container, thanks to volume mount from config.toml
  DOCKER_TLS_CERTDIR: "/certs"

services:
  - docker:19.03.12-dind

before_script:
  - docker info

build:
  stage: build
  script:
    - docker build -t my-docker-image .
    - docker run my-docker-image /script/to/run/tests

I always have this error:

Running with gitlab-runner 14.4.0 (4b9e985a)
  on My Docker Runner u9_6MpHg
Resolving secrets
00:00
Preparing the "docker" executor
Using Docker executor with image docker:19.03.12 ...
Starting service docker:19.03.12-dind ...
Authenticating with credentials from $DOCKER_AUTH_CONFIG
Pulling docker image docker:19.03.12-dind ...
Using docker image sha256:66dc2d45749a48592f4348fb3d567bdd65c9dbd5402a413b6d169619e32f6bd2 for docker:19.03.12-dind with digest docker@sha256:674f1f40ff7c8ac14f5d8b6b28d8fb1f182647ff75304d018003f1e21a0d8771 ...
Waiting for services to be up and running...
Authenticating with credentials from $DOCKER_AUTH_CONFIG
Pulling docker image docker:19.03.12 ...
Using docker image sha256:81f5749c9058a7284e6acd8e126f2b882765a17b9ead14422b51cde1a110b85c for docker:19.03.12 with digest docker@sha256:d41efe7ad0df5a709cfd4e627c7e45104f39bbc08b1b40d7fb718c562b3ce135 ...
Preparing environment
00:00
Running on runner-u96mphg-project-31310309-concurrent-0 via ip-10-120-65-72.ec2.internal...
Getting source from Git repository
00:02
Fetching changes with git depth set to 50...
Initialized empty Git repository in /builds/mediagrif/itt/network/poc/test-private-runner/.git/
Created fresh repository.
Checking out 3d7fe999 as main...
Skipping Git submodules setup
Executing "step_script" stage of the job script
00:01
Using docker image sha256:81f5749c9058a7284e6acd8e126f2b882765a17b9ead14422b51cde1a110b85c for docker:19.03.12 with digest docker@sha256:d41efe7ad0df5a709cfd4e627c7e45104f39bbc08b1b40d7fb718c562b3ce135 ...
$ docker info
Client:
 Debug Mode: false
Server:
**ERROR: Cannot connect to the Docker daemon at tcp://localhost:2375. Is the docker daemon running?**
errors pretty printing info
Cleaning up project directory and file based variables
00:00
ERROR: Job failed: exit code 1

I tried all day long and followed all the instructions in the GitLab documentation, and nothing works. I always get the same error. I tried the shell, docker, and docker-machine executors, and I get the same error with each.

I tried to use DinD, direct Socket and Shell executor to build my Docker image.

I tried specifying DOCKER_HOST, setting a service alias, and disabling certificates.

What I find strange is that even when I change DOCKER_HOST in my .gitlab-ci.yml, /etc/hosts contains the record for the service, but the error message always points to localhost.

I tried versions 13.11.0 and 14.4.0 of GitLab Runner. I tried installing the runner with YUM, and I also tried running it with docker run. In my .gitlab-ci.yml I also tried both Docker 19 and Docker 20.

Nothing works.

Does somebody have a hint for me, please?

Thanks

Yann

CodePudding user response:

There are several things that may be going on here, but it sounds like you've tried the basic DOCKER_HOST stuff. Generally, DinD will set the host to what's necessary, so there is some issue with DinD connecting to your docker daemon on the host. Here are a couple things to try:

  1. SSH into your GitLab runner and run docker ps to ensure that the Docker daemon is running and its socket is reachable. It's possible that the Docker service is not set to start on boot.
  2. When you're connected to your box via SSH, ensure that you can access docker without the use of sudo. If your gitlab-runner user needs to use sudo to access docker, you will get errors.
  3. Start a DinD container on your runner box, passing in the privileged flag, and attempt to access docker from within the DinD container.

Odds are good that the error is how docker is configured on the host - nothing looks wrong with your runner toml or your CI yml.

CodePudding user response:

Two things:

When using the docker:dind service, the hostname of the Docker daemon is docker, not localhost. The GitLab docs kind of contradict themselves here.
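As a small illustration of where that hostname comes from, the service can be declared with an explicit alias. This is only a sketch: GitLab is expected to derive the docker alias from the image name on its own, so the explicit alias mainly documents the hostname the job container should use.

```yaml
services:
  - name: docker:19.03.12-dind
    # Explicit alias; 'docker' is the hostname the job container connects to.
    alias: docker
```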

While the docker:19.03.12 image does set the Docker host correctly, you sometimes need to specify DOCKER_HOST for the benefit of the dind container itself, which has a totally different entrypoint. That entrypoint can result in the Docker host being set to tcp://localhost:2375, which won't work when TLS is enabled.

Also, when DOCKER_TLS_CERTDIR is set to a non-empty value, TLS is enabled, and the TLS-enabled listening port is 2376, not 2375.

To correct this, make either of the following configuration changes:

variables:
  DOCKER_TLS_CERTDIR: "/certs"
  DOCKER_HOST: "tcp://docker:2376" # dind with TLS enabled

OR

variables:
  DOCKER_TLS_CERTDIR: ""
  DOCKER_HOST: "tcp://docker:2375" # dind with TLS disabled
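For reference, here is a minimal sketch of how the first (TLS-enabled) option might fit into the .gitlab-ci.yml from the question. It assumes the runner's config.toml keeps privileged = true and the /certs/client volume shown in the question. DOCKER_TLS_VERIFY and DOCKER_CERT_PATH are standard Docker client variables added here as an assumption; they are not part of the suggestion above and may be unnecessary if the image entrypoint already sets them.

```yaml
image: docker:19.03.12

services:
  - docker:19.03.12-dind

variables:
  # Certificates are generated under /certs and shared via /certs/client.
  DOCKER_TLS_CERTDIR: "/certs"
  # 'docker' is the service hostname; 2376 is the TLS port.
  DOCKER_HOST: "tcp://docker:2376"
  # Assumption: make the client's TLS settings explicit.
  DOCKER_TLS_VERIFY: "1"
  DOCKER_CERT_PATH: "/certs/client"

build:
  stage: build
  before_script:
    - docker info
  script:
    - docker build -t my-docker-image .
```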

You should also double-check that your certificate directory actually contains the proper certificates and that the mount point is writable. If the certs are missing, dind treats it as if TLS is disabled.
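One way to inspect the certificate directory is a throwaway job like the sketch below. The job name check-certs and the .write-test marker file are hypothetical, and the check assumes the /certs/client volume from the question's config.toml. Because the dind service may still be generating certificates when the job starts, an empty listing on the first run is not conclusive.

```yaml
check-certs:
  image: docker:19.03.12
  services:
    - docker:19.03.12-dind
  variables:
    DOCKER_TLS_CERTDIR: "/certs"
  script:
    # List the client certificates shared with the job container.
    - ls -l /certs/client || echo "no client certificates found"
    # Check that the shared mount point is writable (hypothetical marker file).
    - touch /certs/client/.write-test && rm /certs/client/.write-test
```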

If your job seems to use localhost:2375 despite your environment variable, the variable is most likely being overridden somewhere, for example in the project- or group-level CI/CD settings, which take precedence over your YAML configuration.

You can confirm this in your job script by echoing the value:

script:
  - echo $DOCKER_HOST
  - docker info