This is based on the answer I was given at "Executing an R file in one container while mounting a volume containing packages". Please note that I am not using docker-compose, but I intend to move to it once I get this working via the CLI.
Files and Folders
I have the following folder setup:
Code
|--- Dockerfile
|--- iris.R
R
|--- Dockerfile
|--- packages.R
|--- packages.txt
Results (an empty folder)
R/Dockerfile:
#### INITIAL SETUP
FROM r-base:4.2.2
# Copy packages.txt
COPY . /home/
# Define the working directory
WORKDIR /home/
# Execute R from the terminal
CMD ["Rscript", "./packages.R"]
R/packages.R:
install.packages(readLines('packages.txt'))
R/packages.txt:
ggplot2
Code/iris.R:
library(ggplot2)
args <- commandArgs(trailingOnly = TRUE)
p <- ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width,
                      color = Species)) +
  geom_point() +
  geom_smooth(method = "lm") +
  theme_bw() +
  theme(panel.grid = element_blank()) +
  xlab("Sepal Length") +
  ylab("Sepal Width") +
  scale_x_continuous(limits = c(4, 8), breaks = seq(4, 8, 0.5)) +
  scale_y_continuous(limits = c(2, 5), breaks = 2:5)
Code/Dockerfile:
#### INITIAL SETUP
FROM package:first
# Set an environment variable at runtime for the separate directory
ENV MAINDIR /home/
# Copy the file to a path in the container
COPY . ${MAINDIR}
# Set working directory
WORKDIR $MAINDIR
CLI Commands
In the R folder:
docker build -t package:first .
docker run -t package:first
In the Code folder:
docker build -t test .
docker run -t test
What I'm trying to do
The code above runs as expected, without issues. However, what I want is for Rscript iris.R to be executed from the CLI when I run the image built from the Code folder.
Things I have tried
- Adding the following line to Code/Dockerfile:
CMD ["Rscript", "iris.R"]
When I execute the following in the Code folder, the CMD ["Rscript", "./packages.R"] from R/Dockerfile clearly gets ignored, because ggplot2 is not installed.
docker build -t test .
docker run -t test
Sending build context to Docker daemon 3.072kB
Step 1/5 : FROM package:first
---> 570ce8dfefc7
Step 2/5 : ENV MAINDIR /home/
---> Using cache
---> 204f133ca09b
Step 3/5 : COPY . ${MAINDIR}
---> 99e453b90204
Step 4/5 : WORKDIR $MAINDIR
---> Running in 1a6be0dfb5bb
Removing intermediate container 1a6be0dfb5bb
---> 1582e287cdcd
Step 5/5 : CMD ["Rscript", "iris.R"]
---> Running in e6a843cf9a2e
Removing intermediate container e6a843cf9a2e
---> b57708b42a89
Successfully built b57708b42a89
Successfully tagged test:latest
Error in library(ggplot2) : there is no package called ‘ggplot2’
Execution halted
- Adding the following line to Code/Dockerfile:
ENTRYPOINT ["Rscript"]
I think I can guess why this doesn't work: it would just ignore CMD ["Rscript", "./packages.R"] and then run with whatever arguments are passed to docker run:
docker run -t test ./iris.R
Error in library(ggplot2) : there is no package called ‘ggplot2’
Execution halted
- Keeping Code/Dockerfile as above and then running
docker run -t test Rscript ./iris.R
Unsurprisingly, this gives the same outcome as my second attempt.
How do I go about approaching this?
Answer
In your base image, you set the CMD to run the package installation script. A container only runs one CMD, though, so when the derived image declares its own CMD it overrides the base image's CMD, and the installation script never gets run.
You most likely want the library dependencies to be built into the (base) image, and not reinstalled every time you run the container. Changing CMD to RUN will do this. (It's legal to keep the JSON-array syntax if it simplifies things, but it's a little less common for RUN lines.)
# base image R/Dockerfile
RUN Rscript ./packages.R # not CMD
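Applied to the whole R/Dockerfile from the question, the change might look like the sketch below. It assumes the same files (packages.R and packages.txt) sit next to the Dockerfile; only the last instruction differs from the original.

```dockerfile
# R/Dockerfile (base image): sketch based on the question's file
FROM r-base:4.2.2

# Define the working directory
WORKDIR /home/

# Copy packages.R and packages.txt into the image
COPY . /home/

# Install the packages at build time so they become part of the image
# (RUN executes during `docker build`; CMD would only run at container start)
RUN Rscript ./packages.R
```

With this version, docker build -t package:first . in the R folder is enough; the separate docker run -t package:first step should no longer be needed for installing the packages.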
This leaves the base image Dockerfile without a CMD. This is fine; it will inherit the CMD from its base image (here it will just run R). You do usually want a CMD in a final image containing the application or script you actually want to run (for example, your iris.R).
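For the derived image, a minimal sketch of Code/Dockerfile with its own CMD could look like this. It assumes package:first has been rebuilt with the RUN-based installation, so ggplot2 is already present in the base image. The ENV key=value form is a stylistic substitution on my part; the space-separated form from the question behaves the same way.

```dockerfile
# Code/Dockerfile (derived image): sketch based on the question's file
FROM package:first

# Directory that will hold the script inside the image
ENV MAINDIR=/home/

# Copy iris.R (and everything else in the Code folder) into the image
COPY . ${MAINDIR}

# Set working directory
WORKDIR ${MAINDIR}

# Default command for this image; replaces whatever CMD the base image had
CMD ["Rscript", "iris.R"]
```

After rebuilding both images in order (docker build -t package:first . in the R folder, then docker build -t test . in the Code folder), docker run -t test should run iris.R with ggplot2 available.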