Could not load dynamic library 'libcudart.so.11.0' inside docker container

Time:10-26

I have created a Docker image with the following Dockerfile:

FROM python:latest
WORKDIR /root/my_dir
COPY requirements.txt ./
RUN apt-get update &&\
        apt-get upgrade -y &&\
        apt-get install -y curl nano &&\
        pip3 install -r requirements.txt

I then use it as a service, described in docker-compose.yaml:

version: '2.2'

services:
  my_service:
    image: my_image
    volumes:
      - /root/my_dir:/root/my_dir
      - /usr/local/cuda-11.0:/usr/local/cuda-11.0
    environment:
      - LD_LIBRARY_PATH=/usr/local/cuda-11.0/targets/x86_64-linux/lib
    command: ["python3"]
    stdin_open: true
    tty: true
  
  cuda:
    image: nvidia/cuda:11.0.3-devel-ubuntu16.04
    runtime: nvidia
    environment:
      - NVIDIA_VISIBLE_DEVICES=all
    command: nvidia-smi
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 0
              capabilities: [gpu]

Versions:

Docker version 20.10.7, build f0df350
docker-compose version 1.29.0, build 07737305
docker-ce 5:20.10.7~3-0~ubuntu-xenial                  
docker-ce-cli 5:20.10.7~3-0~ubuntu-xenial
docker-ce-rootless-extras 5:20.10.7~3-0~ubuntu-xenial
docker-scan-plugin 0.8.0~ubuntu-xenial
nvidia-docker2 2.11.0-1

On my machine, I have installed CUDA 11.0, and I use Python 3.8 and TensorFlow 2.4.0, as stated here:

https://www.tensorflow.org/install/source#gpu

I run the container with:
docker-compose up
Everything seems to go well, but when I attach to the container and try to import tensorflow in Python, it gives me:
Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory;

I also tried setting the same environment variables as outside the container:

LD_LIBRARY_PATH=/usr/local/cuda-11.0/lib64
CUDA_HOME=/usr/local/cuda-11.0
PATH=$PATH:/usr/local/cuda-11.0/bin

but it has no effect.

I also tried several CUDA Docker images, to no avail.

CodePudding user response:

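The problem is that your docker-compose file defines two separate services. Each service runs in its own container with its own filesystem, so the CUDA libraries in the cuda service are not available to my_service, which is where TensorFlow runs.
The best approach is to build a single image based on a CUDA image and install Python and the packages from requirements.txt in it.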
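
To see from inside my_service whether the library file is reachable at all, a small check along these lines may help. It is only a sketch: the helper name find_library is hypothetical, and it looks only at the directories listed in a search path such as LD_LIBRARY_PATH, whereas the real dynamic loader also consults its cache and default system directories.

```python
import os


def find_library(filename, search_path):
    """Return the directories in search_path that contain a file named filename.

    search_path uses the platform separator (':' on Linux), like LD_LIBRARY_PATH.
    """
    matches = []
    for directory in search_path.split(os.pathsep):
        # Skip empty entries; os.path.isfile follows symlinks, so a dangling
        # symlink counts as "not found".
        if directory and os.path.isfile(os.path.join(directory, filename)):
            matches.append(directory)
    return matches
```

Inside the container, find_library("libcudart.so.11.0", os.environ.get("LD_LIBRARY_PATH", "")) returns an empty list when none of the listed directories holds the file.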

Here is a sample Dockerfile:

FROM nvidia/cuda:11.2.0-cudnn8-runtime-ubuntu20.04
WORKDIR /root/my_dir
COPY requirements.txt ./

RUN apt-get update && apt-get upgrade -y &&\
    apt-get install -y python3 python3-pip nano

RUN pip3 install --upgrade pip && pip3 install -r requirements.txt

ENV NVIDIA_VISIBLE_DEVICES all
ENV NVIDIA_DRIVER_CAPABILITIES compute,utility

And this is the docker-compose.yaml:

version: '2.2'

services:
  my_service:
    build: .
    volumes:
      - /root/my_dir:/root/my_dir
    command: ["bash"]
    stdin_open: true
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
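
Once the container is running, a quick check from Python can confirm whether the CUDA runtime now loads and whether TensorFlow sees a GPU. This is a minimal sketch: the helper names can_load and report are hypothetical, and the library name is the one from the error message in the question.

```python
import ctypes


def can_load(library_name):
    """Return True if the dynamic loader can open the given shared library."""
    try:
        ctypes.CDLL(library_name)
    except OSError:
        return False
    return True


def report():
    """Print whether the CUDA runtime loads and which GPUs TensorFlow sees."""
    print("libcudart.so.11.0 loadable:", can_load("libcudart.so.11.0"))
    try:
        import tensorflow as tf
    except ImportError:
        print("tensorflow is not installed in this environment")
        return
    # tf.config.list_physical_devices is available in TensorFlow 2.1 and later.
    print("GPUs visible to TensorFlow:", tf.config.list_physical_devices("GPU"))
```

Calling report() in the Python prompt inside the container prints both results. An empty GPU list with a loadable library would point to the GPU not being exposed to the container, not to a missing CUDA runtime.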