I have created a Docker image with the following Dockerfile:
FROM python:latest
WORKDIR /root/my_dir
COPY requirements.txt ./
RUN apt-get update && \
    apt-get upgrade -y && \
    apt-get install -y curl && \
    apt-get install -y nano && \
    pip3 install -r requirements.txt
I then use it as a service, described in docker-compose.yaml:
version: '2.2'
services:
  my_service:
    image: my_image
    volumes:
      - /root/my_dir:/root/my_dir
      - /usr/local/cuda-11.0:/usr/local/cuda-11.0
    environment:
      - LD_LIBRARY_PATH=/usr/local/cuda-11.0/targets/x86_64-linux/lib
    command: ["python3"]
    stdin_open: true
    tty: true
  cuda:
    image: nvidia/cuda:11.0.3-devel-ubuntu16.04
    runtime: nvidia
    environment:
      - NVIDIA_VISIBLE_DEVICES=all
    command: nvidia-smi
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 0
              capabilities: [gpu]
Versions:
Docker version 20.10.7, build f0df350
docker-compose version 1.29.0, build 07737305
docker-ce 5:20.10.7~3-0~ubuntu-xenial
docker-ce-cli 5:20.10.7~3-0~ubuntu-xenial
docker-ce-rootless-extras 5:20.10.7~3-0~ubuntu-xenial
docker-scan-plugin 0.8.0~ubuntu-xenial
nvidia-docker2 2.11.0-1
On my machine I have CUDA 11.0 installed, and I use Python 3.8 and TensorFlow 2.4.0, as listed here:
https://www.tensorflow.org/install/source#gpu
I run the container with:
docker-compose up
Everything seems to go well, but when I attach to the container and try to import tensorflow in Python, I get:
Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory;
I also tried setting the same environment variables as outside the container:
LD_LIBRARY_PATH=/usr/local/cuda-11.0/lib64
CUDA_HOME=/usr/local/cuda-11.0
PATH=$PATH:/usr/local/cuda-11.0/bin
but it has no effect.
I also tried several CUDA Docker images, to no avail.
CodePudding user response:
The problem is that your docker-compose file defines two separate services. Each one runs in its own container, so my_service gets neither the CUDA libraries nor the NVIDIA runtime configured for the cuda service.
The best approach is to build a single image based on CUDA and install Python and the packages from requirements.txt in it.
Here is a sample Dockerfile:
FROM nvidia/cuda:11.2.0-cudnn8-runtime-ubuntu20.04
WORKDIR /root/my_dir
COPY requirements.txt ./
RUN apt-get update && apt-get upgrade -y && \
    apt-get install -y python3 python3-pip nano
RUN pip3 install --upgrade pip && pip3 install -r requirements.txt
ENV NVIDIA_VISIBLE_DEVICES=all
ENV NVIDIA_DRIVER_CAPABILITIES=compute,utility
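Once an image like this is built, one way to check whether the error from the question is gone is to ask the dynamic loader for the library directly, without importing TensorFlow. The following is only a minimal sketch using Python's standard ctypes module. The helper name can_load is made up for illustration, and the library name is the one quoted in the error message.

```python
import ctypes


def can_load(library_name):
    """Return True if the dynamic loader can open the given shared library.

    can_load is a hypothetical helper name, not part of any library.
    """
    try:
        ctypes.CDLL(library_name)
    except OSError:
        # Raised when the loader cannot find or open the library.
        return False
    return True


if __name__ == "__main__":
    # Library name taken from the error message in the question.
    name = "libcudart.so.11.0"
    print(name, "loadable:", can_load(name))
```

If this prints False inside the container, the CUDA runtime library is still not on the loader's search path, and TensorFlow would be expected to fail in the same way.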
And this is the corresponding docker-compose.yaml:
version: '2.2'
services:
  my_service:
    build: .
    volumes:
      - /root/my_dir:/root/my_dir
    command: ["bash"]
    stdin_open: true
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
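To confirm that the GPU reservation actually reached the container, it can help to look at what the NVIDIA runtime normally exposes: the NVIDIA_VISIBLE_DEVICES variable and device nodes such as /dev/nvidia0 and /dev/nvidiactl. The sketch below uses only the standard library. The function name nvidia_device_nodes is hypothetical, and the assumption that the nodes live under /dev with an nvidia prefix reflects the usual Linux layout rather than anything stated above.

```python
import glob
import os


def nvidia_device_nodes(dev_dir="/dev"):
    """Return the sorted paths in dev_dir whose names start with 'nvidia'.

    nvidia_device_nodes is a hypothetical helper; dev_dir defaults to the
    usual location of device nodes on Linux.
    """
    return sorted(glob.glob(os.path.join(dev_dir, "nvidia*")))


if __name__ == "__main__":
    # Set by the Dockerfile above; None means the variable is missing.
    print("NVIDIA_VISIBLE_DEVICES =", os.environ.get("NVIDIA_VISIBLE_DEVICES"))
    # An empty list suggests the container was started without GPU access.
    print("device nodes:", nvidia_device_nodes())
```

Run it from the bash session started by the service. An empty list of device nodes would point at the compose or runtime configuration rather than at the image.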