Some background: I'm new to Docker images and containers and to writing Dockerfiles. I currently have a Dockerfile that installs all the dependencies I need with pip install, so building and deploying images has been simple. I now have a new requirement to use the dateinfer module, which cannot be installed through pip install alone: the repo has to be cloned first and then installed from the clone, and I'm having difficulty doing this in a Dockerfile. My current workaround is to run the container, install dateinfer manually in the directory with all the other dependencies, and commit the changes. This is tedious and time-consuming, and I'd like to achieve the same result by declaring it in the Dockerfile along with all my other dependencies.
This is what my Dockerfile looks like:
FROM ubuntu:20.04
RUN apt update
RUN apt upgrade -y
RUN apt-get install -y python3
RUN apt-get install -y python3-pip
RUN DEBIAN_FRONTEND=noninteractive TZ=Etc/UTC apt-get -y install tzdata
RUN apt-get install -y libenchant1c2a
RUN apt install git -y
RUN pip3 install argparse
RUN pip3 install boto3
RUN pip3 install numpy==1.19.1
RUN pip3 install scipy
RUN pip3 install pandas
RUN pip3 install scikit-learn
RUN pip3 install matplotlib
RUN pip3 install plotly
RUN pip3 install kaleido
RUN pip3 install fpdf
RUN pip3 install regex
RUN pip3 install pyenchant
RUN pip3 install openpyxl
ADD core.py /
ENTRYPOINT [ "/usr/bin/python3.8", "/core.py" ]
So when I try to install Dateinfer like this:
RUN git clone https://github.com/nedap/dateinfer.git
RUN cd dateinfer
RUN pip3 install .
It throws the following error:
ERROR: Directory '.' is not installable. Neither 'setup.py' nor 'pyproject.toml' found.
The command '/bin/sh -c pip3 install .' returned a non-zero code: 1
How do I solve this?
Answer:
Each RUN directive in a Dockerfile runs in its own shell process. If you write something like this:
RUN cd dateinfer
That is a no-op: it starts a new shell, changes directory, and then the shell exits. When the next RUN command executes, you're back in the / directory.
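A minimal sketch that makes this behavior visible, assuming a throwaway image built only for the demonstration (the /demo directory is a hypothetical name, not part of the original Dockerfile):

```dockerfile
# Demonstration only: the working directory does not carry over between RUN steps.
FROM ubuntu:20.04
RUN mkdir /demo
# cd and pwd run in the same shell here, so this step prints /demo
RUN cd /demo && pwd
# This step starts a fresh shell in the default working directory, so it prints /
RUN pwd
```

With BuildKit enabled, the output of each step may be collapsed; building with docker build --progress=plain should show the two pwd results.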
The easiest way of resolving this is to include your commands in a single RUN statement:
RUN git clone https://github.com/nedap/dateinfer.git && \
cd dateinfer && \
pip3 install .
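If you would rather keep the steps separate, a possible alternative is the WORKDIR instruction, which does persist across later instructions. This is only a sketch: the clone target /opt/dateinfer is an assumed path chosen for illustration.

```dockerfile
# Clone into an explicit, assumed location instead of relying on cd
RUN git clone https://github.com/nedap/dateinfer.git /opt/dateinfer
# WORKDIR changes the directory for every following instruction
WORKDIR /opt/dateinfer
RUN pip3 install .
# Switch back so later instructions start from / again
WORKDIR /
```

Either way, the chained single RUN shown earlier is usually the more compact choice.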
In fact, you would benefit from doing this with your other pip install commands as well; rather than a bunch of individual RUN commands, consider instead:
RUN pip3 install \
argparse \
boto3 \
numpy==1.19.1 \
scipy \
pandas \
scikit-learn \
matplotlib \
plotly \
kaleido \
fpdf \
regex \
pyenchant \
openpyxl
That will generally be faster because pip only needs to resolve dependencies once. It also produces a single image layer instead of one layer per package.
Rather than specifying all the packages individually on the command line, you could also put them into a requirements.txt file, and then use pip install -r requirements.txt.
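As a rough sketch of the requirements.txt approach, assuming the file sits next to the Dockerfile in the build context and is copied to /tmp inside the image (both the file location and the /tmp path are assumptions for illustration):

```dockerfile
# Assumed requirements.txt in the build context, one package per line, e.g.:
#   argparse
#   boto3
#   numpy==1.19.1
#   scipy
FROM ubuntu:20.04
RUN apt-get update && apt-get install -y python3 python3-pip
# The file must be copied into the image before pip can read it
COPY requirements.txt /tmp/requirements.txt
RUN pip3 install -r /tmp/requirements.txt
```

Copying only requirements.txt before the install step means Docker can reuse the cached pip layer when other files, such as core.py, change.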