I'm building a Docker image that contains big files (>1.0 GB) and small Python scripts. The big files rarely change, so I want them to be cached.
The directory looks like this:
- app/
  - main.py
  - modules/
    - foo.py
    - bar.py
  - big_files/
    - bigone.tar
    - bigtwo.tar
My first Dockerfile:
FROM python3:latest
COPY ./app /opt/app
When I update the Python scripts, it has to COPY all the files again, which takes a long time.
What I want to achieve:
FROM python3:latest
COPY ./app/big_files /opt/app/big_files
COPY ./app /opt/app
However, the second COPY copies the big files again as well.
How can I COPY in two steps so that the big files stay cached?
CodePudding user response:
You should change your app's structure so that the big files are outside the /app folder.
If you don't want to do that, then you have to explicitly copy the files under /app:
FROM python3:latest
COPY ./app/big_files /opt/app/big_files
COPY ./app/*.py /opt/app/
COPY ./app/modules /opt/app/modules
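A minimal sketch of the first option, assuming a hypothetical layout in which big_files/ has been moved next to app/ in the build context (so the context contains ./app and ./big_files), and assuming the official python image is the intended base:

```dockerfile
# Assumed layout (hypothetical): ./big_files sits beside ./app, not inside it.
FROM python:latest

# Rarely changes: as long as the base image and the files under ./big_files
# are unchanged, this layer is reused from the build cache.
COPY ./big_files /opt/app/big_files

# Changes often: this COPY only contains the scripts, so re-running it is cheap.
COPY ./app /opt/app
```

With this split, editing a script only invalidates the last COPY, and that COPY no longer includes the big files.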
CodePudding user response:
Each instruction is built as its own layer. In your case that means:
- Pull base image
- Copy Big Files
- Copy ./app
All of these steps have to run on the initial build.
With a setup similar to yours:
$ tree .
.
├── Dockerfile
└── app
    ├── bigFiles
    │   └── bigFile
    └── smallFile
and a Dockerfile:
FROM python:latest
COPY ./app/bigFiles .
COPY ./app .
RUN echo 'Done'
Now if you run the exact same build again and nothing has changed, each layer can be reused:
$ docker build -t stackoverflowtest .
[+] Building 1.4s (10/10) FINISHED
=> [internal] load build definition from Dockerfile 0.0s
=> => transferring dockerfile: 36B 0.0s
=> [internal] load .dockerignore 0.0s
=> => transferring context: 2B 0.0s
=> [internal] load metadata for docker.io/library/python:latest 1.3s
=> [auth] library/python:pull token for registry-1.docker.io 0.0s
=> [internal] load build context 0.0s
=> => transferring context: 124B 0.0s
=> [1/4] FROM docker.io/library/python:latest@sha256:e9c35537103a2801a30b15a77d4a56b35532c964489b125ec1ff24f3d5b53409 0.0s
=> CACHED [2/4] COPY ./app/bigFiles . 0.0s
=> CACHED [3/4] COPY ./app . 0.0s
=> CACHED [4/4] RUN echo 'Done' 0.0s
=> exporting to image 0.0s
=> => exporting layers 0.0s
=> => writing image sha256:a78f5f730a893ccaaff102a7fe179461ec15c53bcbac926cb818e95d9d012875 0.0s
=> => naming to docker.io/library/stackoverflowtest
If you touch the big files, that layer has to be rebuilt:
$ echo "Change me" > app/bigFiles/bigFile
$ docker build -t stackoverflowtest .
[+] Building 0.9s (9/9) FINISHED
=> [internal] load build definition from Dockerfile 0.0s
=> => transferring dockerfile: 36B 0.0s
=> [internal] load .dockerignore 0.0s
=> => transferring context: 2B 0.0s
=> [internal] load metadata for docker.io/library/python:latest 0.6s
=> [internal] load build context 0.0s
=> => transferring context: 146B 0.0s
=> CACHED [1/4] FROM docker.io/library/python:latest@sha256:e9c35537103a2801a30b15a77d4a56b35532c964489b125ec1ff24f3d5b53409 0.0s
=> [2/4] COPY ./app/bigFiles . 0.0s
=> [3/4] COPY ./app . 0.0s
=> [4/4] RUN echo 'Done' 0.2s
=> exporting to image 0.0s
=> => exporting layers 0.0s
=> => writing image sha256:912022e447d8af0d20d77bc0ea41e0e1451012e1185e97e58f13409b6542bc19 0.0s
=> => naming to docker.io/library/stackoverflowtest
In the example above, steps 2 and 3 were both re-run because bigFiles is also part of app, so neither COPY layer could be taken from the cache.
However, if you only change the small files and not the big files, the big-file layer is reused from the cache:
$ echo "Change me" > app/smallFile
$ docker build -t stackoverflowtest .
[+] Building 0.9s (9/9) FINISHED
=> [internal] load build definition from Dockerfile 0.0s
=> => transferring dockerfile: 36B 0.0s
=> [internal] load .dockerignore 0.0s
=> => transferring context: 2B 0.0s
=> [internal] load metadata for docker.io/library/python:latest 0.6s
=> [1/4] FROM docker.io/library/python:latest@sha256:e9c35537103a2801a30b15a77d4a56b35532c964489b125ec1ff24f3d5b53409 0.0s
=> [internal] load build context 0.0s
=> => transferring context: 148B 0.0s
=> CACHED [2/4] COPY ./app/bigFiles . 0.0s
=> [3/4] COPY ./app . 0.0s
=> [4/4] RUN echo 'Done' 0.1s
=> exporting to image 0.0s
=> => exporting layers 0.0s
=> => writing image sha256:fdb687373689497d81bd6a14d4c30c6586b613b25463cdf5b387475700ccfb07 0.0s
=> => naming to docker.io/library/stackoverflowtest
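Mapped back to the paths from the question, the ordering tested above might look like the following sketch. The base image python:latest is taken from the test setup here and is an assumption about what the question intends; the comments describe the cache behaviour seen in the build logs above.

```dockerfile
# Base image as used in the test above (assumption for the question's setup).
FROM python:latest

# Step 2 in the logs: stays CACHED as long as nothing under app/big_files changes.
COPY ./app/big_files /opt/app/big_files

# Step 3 in the logs: re-runs whenever any file under app/ changes.
# Its source still includes app/big_files, as the question points out.
COPY ./app /opt/app
```

So the first COPY is cached when only the scripts change, but the second COPY still covers the whole app directory. Copying the scripts explicitly, or moving the big files out of app/, keeps the big files out of that second step.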