I am trying to create a very simple Docker application that includes a persistent SQL database and an R script that adds data to the database at regular intervals (but can be run more often if needed). I am new to multi-container applications, but running it on demand is simple enough: I have an SQL database container and an R-based script container that connects to it. But what is the best way to schedule calls to the script?
I could create a cron job inside the container housing the script and run it repeatedly that way, but this feels like it violates the "one process per container" principle, and it wouldn't scale easily if I made things more complex. I could also run cron on my host, but that feels wrong. Other resources suggest creating a separate, persistent container just to coordinate cron jobs.
But if the job I want to run is itself in a Docker container, what's the best way to accomplish this? Can a cron container issue a docker run command against the "sleeping" script container? If possible, I assume it's best to only have containers running when you actually need them.
Lastly, would this all be able to be written into a docker-compose file?
I've successfully used a cron scheduler both INSIDE the container housing the R script and OUTSIDE all the containers, as part of my host's crontab, but my research suggests these are both bad ways to do it.
CodePudding user response:
I also think that scheduled jobs should run in their own containers if possible.
You can make a super simple container scheduler using docker-in-docker and cron like this:
Dockerfile
FROM docker:dind
# copy the crontab into the image (to the default working directory)
COPY crontab .
# install the crontab, then run cron in the foreground so the container keeps running
CMD crontab crontab && crond -f
example crontab file
* * * * * docker run -d hello-world
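In your case the crontab entry would launch your R script image instead of hello-world. A hedged sketch (the image name my-r-script and network name myapp_default are assumptions; adjust them to your setup):
# hypothetical names: point this at your own script image and database network
0 * * * * docker run --rm --network myapp_default my-r-script
This starts a fresh container from the script image on each run and removes it afterwards (--rm), rather than keeping a "sleeping" script container around, so the script only consumes resources while it is actually running.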
build and run the scheduler with
docker build -t scheduler .
docker run -d -v /var/run/docker.sock:/var/run/docker.sock scheduler
It'll then run the hello-world image once every minute. Because the host's Docker socket is mounted into the scheduler, the containers it launches are started by the host's Docker daemon as siblings of the scheduler, not nested inside it.
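And yes, this fits into docker-compose. A minimal sketch, assuming a Postgres database and the scheduler built from the Dockerfile above (service, image, and volume names here are illustrative, not taken from your setup):
docker-compose.yml
version: "3.8"
services:
  db:
    image: postgres:15                  # swap in whatever SQL image you actually use
    environment:
      POSTGRES_PASSWORD: example        # placeholder credentials
    volumes:
      - db-data:/var/lib/postgresql/data   # persistent database storage
  scheduler:
    build: .                            # builds the cron + docker CLI image from the Dockerfile above
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock   # lets cron jobs start sibling containers
volumes:
  db-data:
The R script itself doesn't need a long-running service entry: the scheduler starts it on demand with docker run, and attaching it to the compose network (e.g. --network myapp_default) lets it reach the db service.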