I'm currently building a Docker image to deploy a deep learning application. The image is fairly large, roughly 6 GB. Since deployment time is affected by the size of the image, I wonder if there are any best practices for reducing the image size of ML-related applications.
CodePudding user response:
First, keep the data (if any) out of the image (in volumes, for example). Also, use a .dockerignore file to exclude files you don't want in your image, as in the sketch below.
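For instance, a .dockerignore for a typical ML project might look like this (the paths and file types are hypothetical examples, adjust them to your project):

```
# .dockerignore -- keep large or irrelevant files out of the build context
.git
__pycache__/
*.pyc
data/            # training data mounted as a volume at runtime instead
notebooks/
*.ckpt           # model checkpoints stored outside the image
```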
Now some techniques:
A first technique is to use multistage builds: for example, one stage that installs the dependencies, and a second stage that starts from a slim base image, copies in only the needed artifacts from the first, and runs the app. See the sketch below.
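A minimal sketch of this for a Python app, assuming a requirements.txt file and an app.py entry point (both are placeholders for your own files):

```dockerfile
# Stage 1: install dependencies in a full-featured image
FROM python:3.10 AS builder
WORKDIR /app
COPY requirements.txt .
RUN pip install --prefix=/install --no-cache-dir -r requirements.txt

# Stage 2: start from a slim base and copy in only the installed packages
FROM python:3.10-slim
WORKDIR /app
COPY --from=builder /install /usr/local
COPY . .
CMD ["python", "app.py"]
```

Build tools, compilers, and pip's cache stay in the builder stage and never reach the final image.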
A second technique is to minimize the number of image layers. Each RUN, COPY, and ADD instruction creates a new layer. Try to combine commands into a single one using shell operators (like &&), as shown below.
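As an illustration, here is a combined RUN instruction (the package libgomp1 is just an example of a runtime library an ML app might need):

```dockerfile
# One layer instead of three; removing the apt lists in the same RUN
# keeps those deleted files out of the final layer
RUN apt-get update && \
    apt-get install -y --no-install-recommends libgomp1 && \
    rm -rf /var/lib/apt/lists/*
```

If the cleanup ran in a separate RUN, the files would still be stored in the earlier layer and the image would not shrink.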
A third technique is to take advantage of Docker's build cache. Run every command you can before copying the actual application content into the image. For example, for a Python app, install the dependencies before copying the application code, so that code changes don't invalidate the cached dependency layer.
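A sketch of this ordering, again assuming a requirements.txt manifest:

```dockerfile
FROM python:3.10-slim
WORKDIR /app

# Copy only the dependency manifest first; this layer stays cached
# until requirements.txt itself changes
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy the application code last, so editing it does not
# invalidate the dependency layers above
COPY . .
```

This mostly speeds up rebuilds rather than shrinking the image, but --no-cache-dir also keeps pip's download cache out of the layer.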