Docker compose with conda environment and postgresql database - could not connect to server: Connection refused


I'm trying to dockerize a Python application that runs in a conda environment and queries a PostgreSQL database using SQLAlchemy. I've managed to dockerize my conda env and run Jupyter on it, and I've also managed to run a postgres image separately. I've built a docker-compose.yml with both services, but I can't get them to work together.

I'm running the following code in a Jupyter notebook with the conda env installed and I'm getting (psycopg2.OperationalError) could not connect to server: Connection refused.

import sqlalchemy
engine = sqlalchemy.create_engine('postgres://admin:admin@localhost:5432/postgresdb')
engine.connect()

I'm quite new to Docker, so I'm sure I'm getting some configuration wrong; any help would be great.

Configuration

  • Docker compose:
version: "3"

services:
  database:
    container_name: postgres
    image: 'bitnami/postgresql:12'
    environment:
      - POSTGRES_USER=admin
      - POSTGRES_PASSWORD=admin
      - POSTGRES_HOST=localhost
      - POSTGRES_PORT=5432
      - POSTGRES_DB=postgresdb
    ports:
      - "5432:5432"
    volumes:
      - db_volume:/var/lib/postgresql

  jupyter-conda:
    build: .
    ports:
      - "8888:8888"
    depends_on:
      - database

volumes:
  db_volume:
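One caveat worth knowing with this setup: `depends_on` only orders container startup; it does not wait for postgres to actually accept connections, so the notebook can still hit a refused connection right after `docker-compose up`. A sketch of a healthcheck-gated variant (assuming the same service names, and a Compose version that supports `condition: service_healthy`):

```yaml
services:
  database:
    image: 'bitnami/postgresql:12'
    healthcheck:
      # pg_isready exits 0 once the server accepts connections
      test: ["CMD-SHELL", "pg_isready -U admin -d postgresdb"]
      interval: 5s
      timeout: 5s
      retries: 5

  jupyter-conda:
    build: .
    depends_on:
      database:
        condition: service_healthy
```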
  • Dockerfile:
FROM ubuntu

SHELL [ "/bin/bash", "--login", "-c" ]

# Create a non-root user
ENV USER arquimedes
ENV UID 1000
ENV GID 1000
ENV HOME /home/$USER
RUN adduser --disabled-password \
    --gecos "Non-root user" \
    --uid $UID \
    --home $HOME \
    $USER

RUN apt-get update
RUN apt-get install wget -y

COPY environment.yml requirements.txt /tmp/
RUN chown $UID:$GID /tmp/environment.yml /tmp/requirements.txt

COPY postBuild /usr/local/bin/postBuild.sh
RUN chown $UID:$GID /usr/local/bin/postBuild.sh && \
    chmod u+x /usr/local/bin/postBuild.sh

COPY entrypoint.sh /usr/local/bin/entrypoint.sh
RUN chown $UID:$GID /usr/local/bin/entrypoint.sh && \
    chmod u+x /usr/local/bin/entrypoint.sh

USER $USER
# install miniconda
ENV MINICONDA_VERSION latest
ENV CONDA_DIR $HOME/miniconda3
RUN wget https://repo.anaconda.com/miniconda/Miniconda3-$MINICONDA_VERSION-Linux-x86_64.sh -O ~/miniconda.sh && \
    chmod +x ~/miniconda.sh && \
    ~/miniconda.sh -b -p $CONDA_DIR && \
    rm ~/miniconda.sh
# make non-activate conda commands available
ENV PATH=$CONDA_DIR/bin:$PATH
# make conda activate command available from /bin/bash --login shells
RUN echo ". $CONDA_DIR/etc/profile.d/conda.sh" >> ~/.profile
# make conda activate command available from /bin/bash --interactive shells
RUN conda init bash

# create a project directory inside user home
ENV PROJECT_DIR $HOME/conda-oracle
RUN mkdir $PROJECT_DIR
WORKDIR $PROJECT_DIR

# build the conda environment
ENV ENV_PREFIX $PROJECT_DIR/env
RUN conda update --name base --channel defaults conda && \
    conda env create --prefix $ENV_PREFIX --file /tmp/environment.yml
# run the postBuild script to install any JupyterLab extensions
RUN conda activate $ENV_PREFIX && \
    /usr/local/bin/postBuild.sh && \
    conda deactivate

ENTRYPOINT [ "/usr/local/bin/entrypoint.sh" ]

# expose the api port
EXPOSE 8888

# password for jupyter
ENV JUPYTER_TOKEN eureka

# copy source files 
COPY --chown=$UID:$GID /src $PROJECT_DIR/src

# start the container with jupyter
CMD [ "jupyter", "lab", "--no-browser", "--ip", "0.0.0.0" ]

docker-compose up output

(screenshot: database output)

EDIT: Part 2

So, I've been able to connect to the database now thanks to jabbson's answer, but I'm now having trouble interacting with the db. I'm trying to publish a DataFrame to the database from pandas and then query it; the publish works but the query fails. Does anybody have any idea what I'm doing wrong? Thank you so much for your time! Code below.

import pandas as pd
from datetime import date
from typing import Dict, Optional, Any
from sqlalchemy import create_engine
from sqlalchemy.types import CHAR, INTEGER, DATE

CREDENTIALS = 'postgresql://admin:admin@database:5432/postgresdb' 

def create_table_from_pandas(df: pd.DataFrame, table_name: str, schema: Dict[str, Any], chunksize: Optional[int] = None) -> None:
    connection = create_engine(CREDENTIALS)

    # if_exists='replace' first drops anything already in the table
    df.to_sql(
        name=table_name,
        con=connection,
        if_exists="replace",
        index=False,
        dtype=schema,
        chunksize=chunksize,
    )

def execute_query(query: str, params: Optional[Dict[str, str]] = None) -> pd.DataFrame:
    connection = create_engine(CREDENTIALS)

    df_result = pd.read_sql(query, params=params, con=connection)

    return df_result

df_test = pd.DataFrame({'COL1':[1,2,3,4,5], 'COL2':['a','b','c','d','e'], 'COL3':[date.today()]*5})

df_schema = {
    "COL1": INTEGER(),
    "COL2": CHAR(20),
    "COL3": DATE(),
}

create_table_from_pandas(df_test, 'DFTEST', df_schema)

Everything is fine until this point, then I run:

execute_query("SELECT * FROM DFTEST")

And get: UndefinedTable: relation "dftest" does not exist

If I run

execute_query("SELECT * FROM information_schema.tables WHERE table_name = 'DFTEST'")

I can see the table (screenshot: pandas table).

Again, thanks so much for your time, I'm still pretty new to all this.

CodePudding user response:

The issue is that you are connecting to localhost from your Jupyter notebook. If you want to connect to another service, use its name as the host:

postgres://admin:admin@database:5432/postgresdb
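The two URLs differ only in the host component; a minimal stdlib sketch of the contrast (credentials and names taken from the compose file above):

```python
from urllib.parse import urlsplit

# inside the jupyter-conda container, "localhost" is that container itself,
# so postgres is never reachable there; the compose service name "database"
# resolves to the postgres container on the shared network
wrong = urlsplit('postgresql://admin:admin@localhost:5432/postgresdb')
right = urlsplit('postgresql://admin:admin@database:5432/postgresdb')

print(wrong.hostname)  # localhost
print(right.hostname)  # database
```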

For the second question, run

execute_query("SELECT * FROM \"DFTEST\"")

or

execute_query('SELECT * FROM "DFTEST"')

more details here
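The underlying rule is PostgreSQL's identifier case folding: unquoted names in SQL are folded to lowercase, while double-quoted names keep their exact case, so `SELECT * FROM DFTEST` looks for a table named `dftest`. A rough pure-Python sketch of the rule (the `fold_identifier` helper is hypothetical, for illustration only):

```python
def fold_identifier(ident: str) -> str:
    # PostgreSQL folds unquoted identifiers to lowercase;
    # double-quoted identifiers keep their exact case
    if ident.startswith('"') and ident.endswith('"'):
        return ident[1:-1]
    return ident.lower()

print(fold_identifier('DFTEST'))    # dftest -> no such table
print(fold_identifier('"DFTEST"'))  # DFTEST -> matches the created table
```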

CodePudding user response:

jabbson's answer solved both problems; in addition, creating the table with a lowercase name also avoids the query problem.
