Should I commit PostgreSQL data to GitHub?

Time:01-20

When I dockerized my app, a data folder containing PostgreSQL files and data was created. It comes from this docker-compose.yml snippet:

volumes:
      - ./data/db:/var/lib/postgresql/data

This folder is constantly populated with data while the app runs (it is not in production yet, but it is running under Docker Compose), which means that every time I commit changes to the app code, various data files have also been created or modified. Should I add that folder to .gitignore, or should I really commit it?

CodePudding user response:

You should include this directory in your .gitignore file (and, if appropriate, .dockerignore as well).

Since the PostgreSQL data consists of opaque binary files, it can't be stored well in source control: Git has no meaningful way to diff or merge it. The database storage is also specific to your instance of the application. If you and a colleague are both working on the same application, and they commit changed database files to source control, your only real choices are to overwrite your local data with theirs, or to ignore their changes.

Most application frameworks have two key features to support working with databases. You can run migrations, which typically create tables and indexes and can be applied incrementally to bring an older database up to a newer schema. You can also supply seed data to initialize a developer's database. Both are plain text, often SQL files or CSV-format data, so they can be committed to source control.
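If your framework has no seeding mechanism of its own, the official postgres image offers one way to load plain-text seed files. This is a minimal sketch, not necessarily how your project should do it: the ./db/init directory and the example password are assumptions made for illustration, and the service name database is just a label. The image runs any *.sql or *.sh files it finds in /docker-entrypoint-initdb.d, but only on first start, when the data directory is still empty.

```yaml
services:
  database:
    image: postgres
    environment:
      # Placeholder value (assumption); the image needs a password on first start.
      POSTGRES_PASSWORD: example
    volumes:
      # Hypothetical directory of plain-text .sql seed files kept in source control.
      # Mounted read-only; scripts run only when the data directory is empty.
      - ./db/init:/docker-entrypoint-initdb.d:ro
```

The seed files in ./db/init are what you would commit; the generated database files stay out of the repository.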

In a Docker context, you have one more option. Since you can't usefully read the opaque database data and you don't want it committed to source control, you can ask Docker to manage the storage in a named volume instead of a host directory. On some platforms this can also be noticeably faster. The trade-off is that backing up the data, or otherwise working with it as files rather than through the database layer, becomes harder.

version: '3.8'
services:
  database:
    image: postgres
    volumes:
      - 'database_data:/var/lib/postgresql/data'
      #  ^^^^^^^^^^^^^ volume name, not directory path
volumes:
  database_data:
    # empty, but must be declared