Install mongo CLI in Dockerfile for apache/airflow docker image

Time:07-29

FROM apache/airflow:2.2.4

# install mongodb-org-tools: up-to-date MongoDB tools that can handle --uri=mongodb+srv:// connection strings
RUN apt-get update && apt-get install -y gnupg software-properties-common && \
    curl -fsSL https://www.mongodb.org/static/pgp/server-4.2.asc | apt-key add - && \
    add-apt-repository 'deb https://repo.mongodb.org/apt/debian buster/mongodb-org/4.2 main' && \
    apt-get update && \
    apt-get install -y mongodb-org-tools

ADD requirements.txt /requirements.txt
RUN pip install -r /requirements.txt

We need to use MongoDB CLI commands such as mongoimport and mongoexport in a BashOperator in our Airflow project, because our workflow moves data into a MongoDB database. We strongly prefer CLI commands like mongoimport over the Python pymongo package.

When we build the image, it seems we do not have permission to install mongo - we receive the following error:

=> ERROR [cbb-airflow_airflow-webserver 2/4] RUN apt-get update && apt-get install -y gnupg software-properties-common &&     curl -fsSL https://www.  0.6s
------
 > [cbb-airflow_airflow-webserver 2/4] RUN apt-get update && apt-get install -y gnupg software-properties-common &&     curl -fsSL https://www.mongodb.org/static/pgp/server-4.2.asc | apt-key add - &&     add-apt-repository 'deb https://repo.mongodb.org/apt/debian buster/mongodb-org/4.2 main' &&     apt-get update &&     apt-get install -y mongodb-org-tools:
#0 0.460 Reading package lists...
#0 0.592 E: List directory /var/lib/apt/lists/partial is missing. - Acquire (13: Permission denied)
------
failed to solve: executor failed running [/bin/bash -o pipefail -o errexit -o nounset -o nolog -c apt-get update && apt-get install -y gnupg software-properties-common &&     curl -fsSL https://www.mongodb.org/static/pgp/server-4.2.asc | apt-key add - &&     add-apt-repository 'deb https://repo.mongodb.org/apt/debian buster/mongodb-org/4.2 main' &&     apt-get update &&     apt-get install -y mongodb-org-tools]: exit code: 100

What is the best way to install mongo CLI for commands like mongoimport using the official apache/airflow docker image?

CodePudding user response:

Add USER root after the FROM statement.

The updated Dockerfile looks like this:

FROM apache/airflow:2.2.4

USER root

# install mongodb-org-tools: up-to-date MongoDB tools that can handle --uri=mongodb+srv:// connection strings
RUN apt-get update && apt-get install -y gnupg software-properties-common && \
    curl -fsSL https://www.mongodb.org/static/pgp/server-4.2.asc | apt-key add - && \
    add-apt-repository 'deb https://repo.mongodb.org/apt/debian buster/mongodb-org/4.2 main' && \
    apt-get update && \
    apt-get install -y mongodb-org-tools

# switch back to the unprivileged airflow user before installing Python packages
USER airflow

COPY requirements.txt /requirements.txt
RUN pip install -r /requirements.txt

TL;DR

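The apache/airflow:2.2.4 Docker image sets its default user to airflow (UID 50000). You can confirm this from the 49th instruction in the image's Dockerfile.

As a result, every RUN instruction in a Dockerfile built on this image executes as the airflow user, which is unprivileged and cannot write to system directories such as /var/lib/apt/lists. That is why apt-get update fails with "Permission denied".

To fix this, explicitly switch to the root user with USER root before running apt-get. Once the system packages are installed, it is good practice to switch back with USER airflow so that pip install and the container itself do not run as root.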
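
Once the image builds, the mongoimport call from the question could be assembled in Python and handed to a BashOperator. The following is only a minimal sketch: the helper name build_mongoimport_command, the connection string, the collection name, and the file path are all hypothetical placeholders, and the flags used (--uri, --collection, --type, --file, --drop) are standard mongoimport options.

```python
import shlex


def build_mongoimport_command(uri, collection, file_path, file_type="json", drop=False):
    """Build a shell-safe mongoimport command string (hypothetical helper)."""
    args = [
        "mongoimport",
        f"--uri={uri}",                # accepts mongodb:// and mongodb+srv:// URIs
        f"--collection={collection}",
        f"--type={file_type}",         # json, csv, or tsv
        f"--file={file_path}",
    ]
    if drop:
        # Drop the target collection before importing.
        args.append("--drop")
    # Quote each argument so spaces or shell metacharacters cannot break the command.
    return " ".join(shlex.quote(arg) for arg in args)


if __name__ == "__main__":
    # Placeholder values for illustration only.
    command = build_mongoimport_command(
        uri="mongodb+srv://user:password@cluster.example.net/mydb",
        collection="events",
        file_path="/opt/airflow/data/events.json",
    )
    print(command)
```

The resulting string can be passed as the bash_command argument of a BashOperator task. In a real deployment the connection string would more likely come from an Airflow connection or an environment variable than be hard-coded.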
