I'm trying to build a Docker image that I can use as a development environment for modifying PyTorch. There is a Dockerfile provided in the repo, and I'm trying the following:
git clone --recursive https://github.com/pytorch/pytorch
cd pytorch
DOCKER_BUILDKIT=1 docker build -t pytorchtest .
But the docker build results in the following error:
...
#20 28.80 Performing C SOURCE FILE Test HAS_WERROR_CAST_FUNCTION_TYPE failed with the following output:
#20 28.80 Change Dir: /opt/pytorch/build/CMakeFiles/CMakeTmp
#20 28.80
#20 28.80 Run Build Command(s):/usr/bin/make -f Makefile cmTC_09005/fast && /usr/bin/make -f CMakeFiles/cmTC_09005.dir/build.make CMakeFiles/cmTC_09005.dir/build
#20 28.80 make[1]: Entering directory '/opt/pytorch/build/CMakeFiles/CMakeTmp'
#20 28.80 Building CXX object CMakeFiles/cmTC_09005.dir/src.cxx.o
#20 28.80 /usr/bin/c++ -DHAS_WERROR_CAST_FUNCTION_TYPE -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOCUPTI -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-unused-parameter -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -fPIE -Werror=cast-function-type -o CMakeFiles/cmTC_09005.dir/src.cxx.o -c /opt/pytorch/build/CMakeFiles/CMakeTmp/src.cxx
#20 28.80 cc1plus: error: -Werror=cast-function-type: no option -Wcast-function-type
#20 28.80 CMakeFiles/cmTC_09005.dir/build.make:77: recipe for target 'CMakeFiles/cmTC_09005.dir/src.cxx.o' failed
#20 28.80 make[1]: *** [CMakeFiles/cmTC_09005.dir/src.cxx.o] Error 1
#20 28.80 make[1]: Leaving directory '/opt/pytorch/build/CMakeFiles/CMakeTmp'
#20 28.80 Makefile:127: recipe for target 'cmTC_09005/fast' failed
#20 28.80 make: *** [cmTC_09005/fast] Error 2
#20 28.80
#20 28.80
#20 28.80 Source file was:
#20 28.80 int main() { return 0; }
#20 DONE 29.0s
------
executor failed running [/bin/sh -c TORCH_CUDA_ARCH_LIST="3.5 5.2 6.0 6.1 7.0 PTX 8.0" TORCH_NVCC_FLAGS="-Xfatbin -compress-all"
CMAKE_PREFIX_PATH="$(dirname $(which conda))/../" python setup.py install]: exit code: 1
I cannot get at the full error logs because they only exist in the temporary filesystem used during the image build.
I find it somewhat strange that building a stable release image fails. Am I doing something wrong?
The Dockerfile:
# syntax = docker/dockerfile:experimental
#
# NOTE: To build this you will need a docker version > 18.06 with
# experimental enabled and DOCKER_BUILDKIT=1
#
# If you do not use buildkit you are not going to have a good time
#
# For reference:
# https://docs.docker.com/develop/develop-images/build_enhancements/
ARG BASE_IMAGE=ubuntu:18.04
ARG PYTHON_VERSION=3.8
FROM ${BASE_IMAGE} as dev-base
RUN apt-get update && apt-get install -y --no-install-recommends \
build-essential \
ca-certificates \
ccache \
# cmake=3.10.2-1ubuntu2.18.04.2 \
cmake \
curl \
git \
libjpeg-dev \
libpng-dev && \
rm -rf /var/lib/apt/lists/*
RUN /usr/sbin/update-ccache-symlinks
RUN mkdir /opt/ccache && ccache --set-config=cache_dir=/opt/ccache
ENV PATH /opt/conda/bin:$PATH
FROM dev-base as conda
ARG PYTHON_VERSION=3.8
# Automatically set by buildx
ARG TARGETPLATFORM
# translating Docker's TARGETPLATFORM into miniconda arches
RUN case ${TARGETPLATFORM} in \
"linux/arm64") MINICONDA_ARCH=aarch64 ;; \
*) MINICONDA_ARCH=x86_64 ;; \
esac && \
curl -fsSL -v -o ~/miniconda.sh -O "https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-${MINICONDA_ARCH}.sh"
COPY requirements.txt .
RUN chmod +x ~/miniconda.sh && \
~/miniconda.sh -b -p /opt/conda && \
rm ~/miniconda.sh && \
/opt/conda/bin/conda install -y python=${PYTHON_VERSION} cmake conda-build pyyaml numpy ipython && \
/opt/conda/bin/python -mpip install -r requirements.txt && \
/opt/conda/bin/conda clean -ya
FROM dev-base as submodule-update
WORKDIR /opt/pytorch
COPY . .
RUN git submodule update --init --recursive --jobs 0
FROM conda as build
WORKDIR /opt/pytorch
COPY --from=conda /opt/conda /opt/conda
COPY --from=submodule-update /opt/pytorch /opt/pytorch
RUN --mount=type=cache,target=/opt/ccache \
TORCH_CUDA_ARCH_LIST="3.5 5.2 6.0 6.1 7.0 PTX 8.0" TORCH_NVCC_FLAGS="-Xfatbin -compress-all" \
CMAKE_PREFIX_PATH="$(dirname $(which conda))/../" \
python setup.py install || cat /opt/pytorch/build/CMakeFiles/CMakeError.log
Answer:
The issue was with the COPY --from=submodule-update /opt/pytorch /opt/pytorch instruction. Some .bzl files were not getting copied. More precisely, they were not getting added to the Docker build context because of a .dockerignore file. I've added the following line to the end of the .dockerignore and now it works:
!*.bzl
As far as I understand, this is a bug. These files are committed to the repo, so they should get copied.
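For anyone who wants to script the one-line change above, here is a minimal sketch, assuming a POSIX shell and that it is run from the root of the checkout. The function name add_bzl_exception is made up for illustration. It appends the exception at the end of the file, as described above (as far as I know, a later matching line in a .dockerignore takes precedence over earlier ones), and it skips the append if the line is already there.

```shell
# Hypothetical helper: append '!*.bzl' to a .dockerignore file exactly once.
# The optional first argument is the path to the file (default: ./.dockerignore).
add_bzl_exception() {
    ignore_file="${1:-.dockerignore}"

    # Nothing to do if the exact line is already present
    # (-x: whole-line match, -F: fixed string, -q: quiet).
    if grep -qxF -e '!*.bzl' "$ignore_file" 2>/dev/null; then
        return 0
    fi

    # If the file is non-empty and does not end with a newline, add one first
    # so the new pattern does not get glued onto the previous line.
    if [ -s "$ignore_file" ] && [ -n "$(tail -c 1 "$ignore_file")" ]; then
        printf '\n' >> "$ignore_file"
    fi

    printf '%s\n' '!*.bzl' >> "$ignore_file"
}
```

Calling add_bzl_exception with no argument edits ./.dockerignore, after which the same docker build command from the question can be re-run.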