I'm trying to crawl a static HTML site over HTTP as part of building a Docker image. When I curl
http://localhost during the build I get "connection refused". If I remove the curl
statement and run the container, the static content is served at localhost as expected (including when I curl
from inside the running container).
Is there any way to access an nginx endpoint during a build?
FROM ubuntu AS indexer
RUN apt-get update && apt-get install -y nginx
RUN apt-get update && apt-get install -y curl
COPY --from=builder /workdir/build /usr/share/nginx/html
RUN service nginx start
RUN curl http://localhost > /tmp/index.html
I've tried waiting for port 80 to be available before I run curl
but it makes no difference.
CodePudding user response:
Each RUN statement executes in its own temporary container: the filesystem changes it makes are saved as an image layer, but any processes it started are gone when that container exits. So the nginx you start in one RUN statement is no longer running when the next RUN statement begins.
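You can see this directly in a minimal sketch: files written in one step survive into the next, while processes do not.

```dockerfile
FROM ubuntu
RUN apt-get update && apt-get install -y nginx
# Filesystem changes persist across steps as image layers...
RUN echo hello > /tmp/state
RUN cat /tmp/state                # works: the file is still there
# ...but processes do not: this nginx exists only inside this step's container
RUN service nginx start
RUN service nginx status || true  # reports nginx is not running: a fresh container, fresh process table
```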
To do what you're trying to do, nginx needs to be started in the same RUN statement as you do your crawling.
Something like this:
FROM ubuntu AS indexer
RUN apt-get update && apt-get install -y nginx
RUN apt-get update && apt-get install -y curl
COPY --from=builder /workdir/build /usr/share/nginx/html
RUN service nginx start && sleep 10 && curl http://localhost > /tmp/index.html
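A fixed sleep is a bit fragile: if nginx takes longer than expected to come up, the build still fails, and if it comes up quickly you wait for nothing. As an alternative sketch, curl can retry until the port accepts connections (the --retry-connrefused flag requires curl 7.52 or newer):

```dockerfile
RUN service nginx start && \
    curl --silent --show-error \
         --retry 10 --retry-delay 1 --retry-connrefused \
         http://localhost > /tmp/index.html
```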
It does seem a little odd to fetch the files through nginx rather than read them straight from the file system. In the (admittedly limited) example you've posted, you could just do
RUN cp /usr/share/nginx/html/index.html /tmp/
instead.