Docker multi platform builds extremely slow for ARM64 on Gitlab CI


I have the following Dockerfile for a Node.js application:

# ---> Build stage
FROM node:18-bullseye as node-build

ENV NODE_ENV=production
WORKDIR /usr/src/app
COPY . /usr/src/app/
RUN yarn install --silent --production=true --frozen-lockfile
RUN yarn build --silent

# ---> Serve stage
FROM nginx:stable-alpine
COPY --from=node-build /usr/src/app/dist /usr/share/nginx/html

Up until now I was building exclusively for AMD64, but now I also need to build for ARM64.

I edited my .gitlab-ci.yml to look like the following:

image: docker:20

variables:
    PROJECT_NAME: "project"
    BRANCH_NAME: "main"
    IMAGE_NAME: "$PROJECT_NAME:$CI_COMMIT_TAG"

services:
    - docker:20-dind

build_image:
    script:
      # Push to Gitlab registry
      - docker login $CI_REGISTRY -u $CI_REGISTRY_USER -p $CI_REGISTRY_PASSWORD
      - docker context create builder-context
      - docker buildx create --name builderx --driver docker-container --use builder-context
      - docker buildx build --tag $CI_REGISTRY/mygroup/$PROJECT_NAME/$IMAGE_NAME --push --platform=linux/arm64/v8,linux/amd64 .

Everything works relatively fine for AMD64, but the build is extremely slow for ARM64: almost 10x slower than AMD64, which causes timeouts on the GitLab job.

Is there any way to speed up the process?

CodePudding user response:

I'm guessing your pipeline is executing on amd64 hardware and that docker buildx is using QEMU emulation to build the arm64 target. You will likely see a large improvement if you break build_image into two jobs (one for amd64 and one for arm64) and send them to two different GitLab runners, so that each executes on its native hardware.
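
A minimal sketch of that split, assuming you have runners registered with the (hypothetical) tags amd64 and arm64, each running on matching hardware; registry login and the buildx builder setup from your before_script are omitted for brevity:

build_amd64:
    tags:
      - amd64   # routes this job to a runner on amd64 hardware (tag name is an assumption)
    script:
      - docker buildx build --tag "$CI_REGISTRY/mygroup/$PROJECT_NAME/$IMAGE_NAME-amd64" --push --platform=linux/amd64 .

build_arm64:
    tags:
      - arm64   # routes this job to a runner on arm64 hardware (tag name is an assumption)
    script:
      - docker buildx build --tag "$CI_REGISTRY/mygroup/$PROJECT_NAME/$IMAGE_NAME-arm64" --push --platform=linux/arm64/v8 .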

Even if you can't or don't want to stop using emulation, you could still break the build_image job into two jobs (one per image built), in hopes that running them in parallel will let both finish before the timeout limit.
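
A related caveat if you keep emulating: foreign-architecture builds inside docker-in-docker only work if QEMU binfmt handlers are registered. If the arm64 build fails outright instead of merely running slowly, a common fix (a sketch, using the tonistiigi/binfmt helper image from the buildx documentation) is an extra setup step such as:

      - docker run --privileged --rm tonistiigi/binfmt --install arm64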

With changes to your Dockerfile and the use of image caching you can make some of your subsequent builds faster, but these changes won't help you until you get an initial image built (which can be used as the cache).

Updated Dockerfile:

# ---> Build stage
FROM node:18-bullseye as node-build

ENV NODE_ENV=production
WORKDIR /usr/src/app
# COPY only package.json and yarn.lock so the dependency layers are
# not invalidated when only source files change
COPY package.json yarn.lock /usr/src/app/
RUN yarn install --silent --production=true --frozen-lockfile
# once the dependencies are installed, then copy in the frequently changing source code files
COPY . /usr/src/app/
RUN yarn build --silent

# ---> Serve stage
FROM nginx:stable-alpine
COPY --from=node-build /usr/src/app/dist /usr/share/nginx/html
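
You can sanity-check the layer caching locally before wiring it into CI (the myapp tag below is just a placeholder):

docker build -t myapp .
# edit a source file (but not package.json or yarn.lock), then rebuild:
docker build -t myapp .   # the yarn install step should now show up as CACHED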

Updated .gitlab-ci.yml:

image: docker:20

variables:
    PROJECT_NAME: "project"
    BRANCH_NAME: "main"
    IMAGE_NAME: "$PROJECT_NAME:$CI_COMMIT_TAG"
    REGISTRY_IMAGE_NAME: "$CI_REGISTRY/mygroup/$PROJECT_NAME/$IMAGE_NAME"
    CACHE_IMAGE_AMD64: "$CI_REGISTRY/mygroup/$PROJECT_NAME/$PROJECT_NAME:cache-amd64"
    CACHE_IMAGE_ARM64: "$CI_REGISTRY/mygroup/$PROJECT_NAME/$PROJECT_NAME:cache-arm64"

services:
    - docker:20-dind

stages:
    - build
    - push

before_script:
    - docker login $CI_REGISTRY -u $CI_REGISTRY_USER -p $CI_REGISTRY_PASSWORD
    - docker context create builder-context
    - docker buildx create --name builderx --driver docker-container --use builder-context

build_amd64:
    stage: build
    script:
      - docker buildx build --cache-from "$CACHE_IMAGE_AMD64" --cache-to type=inline --tag "$CACHE_IMAGE_AMD64" --push --platform=linux/amd64 .

build_arm64:
    stage: build
    script:
      - docker buildx build --cache-from "$CACHE_IMAGE_ARM64" --cache-to type=inline --tag "$CACHE_IMAGE_ARM64" --push --platform=linux/arm64/v8 .

push:
    stage: push
    script:
      - docker buildx build --cache-from "$CACHE_IMAGE_AMD64" --cache-from "$CACHE_IMAGE_ARM64" --tag "$REGISTRY_IMAGE_NAME" --push --platform=linux/arm64/v8,linux/amd64 .

The build_amd64 and build_arm64 jobs each pull the last cache image built for their architecture and use it as a cache for the Docker image layers, then push their result back as the new cache (--cache-to type=inline embeds the cache metadata in the pushed image). Using a separate cache tag per architecture keeps the two parallel jobs from overwriting each other's cache.
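
As an aside, if you'd rather not embed cache metadata in the images themselves, buildx can also export the cache to a dedicated registry reference instead of inline. A sketch for the amd64 job (mode=max additionally caches layers from intermediate stages such as node-build; the image tag is a placeholder):

      - docker buildx build --cache-from type=registry,ref=$CACHE_IMAGE_AMD64 --cache-to type=registry,ref=$CACHE_IMAGE_AMD64,mode=max --tag <your-image-tag> --push --platform=linux/amd64 .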

The push stage runs docker buildx ... again, but it won't actually build anything new, as it just pulls in the cached results from the two build jobs. This lets you break up the builds while still having a single push command that puts the two images into a single multi-platform Docker manifest.
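
To verify that both platforms made it into the pushed manifest, you can inspect it with buildx's imagetools subcommand:

docker buildx imagetools inspect "$REGISTRY_IMAGE_NAME"

The output should list both linux/amd64 and linux/arm64/v8 entries.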
