Home > Enterprise >  Postgres empty directories and GitHub
Postgres empty directories and GitHub

Time:12-26

EDIT: Some context was missing from the original question. My goal in using Git was to share the table structure and stored procedures of a mostly empty database under development with a second person building a REST API server

Git does not track empty directories, but PostgreSQL requires them at startup. How can I get the two working together?

Every search I make on the topic brings me to some variation of a Git workaround that involves adding dummy files to directories that would otherwise be empty. However, PostgreSQL did not appreciate that solution in at least the pg_tblspc directory.

The directories required by PostgreSQL at startup are:

  • pg_notify
  • pg_tblspc
  • pg_replslot
  • pg_twophase
  • pg_stat
  • pg_snapshots
  • pg_commit_ts
  • pg_logical/mappings
  • pg_logical/snapshots

I am running PostgreSQL from a Docker container as follows:

docker-compose.yml

version: "3.9"
services:
  db:
    build: ./pg_db/
    image: db_MyDB
    ports:
      - "5432:5432"
    volumes:
      - ./pg_db/db_data:/var/lib/postgresql/data 

./pg_db/Dockerfile

FROM postgres:14

ENV POSTGRES_PASSWORD=admin
ENV POSTGRES_USER=<admin password>
ENV POSTGRES_DB=db_MyDB

Then in the CLI...

git add .
git commit -m "something"
git push -u origin HEAD

Later, if I or another user then try to replicate the database on GitHub from the CLI...

git clone https://github.com/<user>/<project>.git
cd <project>
docker-compose -up

The result is a set of startup failures related to the missing, empty, directories listed above. Once I manually create them, PostgreSQL starts up, without any issues to my knowledge.

CodePudding user response:

Thank you to Richard and Adrian for responding so promptly.

EDIT: Adrian highlights in a comment below that Git is a strong compliment to a Schema Management Tool

Git is the wrong tool. What I need is a Schema Management Tool several of which are highlighted in the PostgreSQL wiki here: https://wiki.postgresql.org/wiki/Change_management_tools_and_techniques

Some context was missing from the original question.

My goal in using Git was to share the table structure and stored procedures of a mostly empty database under development with a second person building a REST API server. My approach of sharing the entire data directory caused some confusion and in hindsight I can see how pushing likely stale data around with schema changes would would be a terrible idea in most scenario's.

I have what I need, but if anyone would like to provide a more inciteful answer for anyone else who might wind up here in a search I'd be happy to accept your answer as the correct one.

CodePudding user response:

Generally speaking it is best to keep the schema object definitions in text files external to the database in some sort of change management framework. This can either be a home grown solution or one of the many existing solutions as shown in your link Change Management. In either case keeping this under version control e.g. Git is an extra layer of flexibility and redundancy.

  • Related