I believe this question already shows that I am new to docker and alembic. I am building a flask sqlalchemy app using docker and postgres. So far I am not using alembic, but I am about to plug it in and some questions came up. I will have to create a pg_trgm extension and also populate one of the tables with data I already have. Until now I have only created brand new databases using sqlalchemy for the tests. So here is what I am thinking/doing:
To create the extension I could simple add a volume to the postgres docker service like: ./pg_dump.sql:/docker-entrypoint-initdb.d/pg_dump.sql. The extension does not depend on any specific db, so a simple "CREATE EXTENSION IF NOT EXISTS pg_trgm WITH SCHEMA public;" would do it, right?
If I use the same strategy to populate the tables I need a pg_dump.sql that creates the complete db and tables. To accomplish that I first created the brand new database on sqlalchemy, then I used a script to populate the tables with data I have on a json file. I then generated the complete pg_dump.sql and now I can place this complete .sql file on the docker service volume and when I run my docker-compose the postgres container will have the dabatase ready to go.
Now I am starting with alembic and I am thinking I could just keep the pg_dump.sql to create the extensions, and have a alembic migration script to populate the empty tables (dropping the item 2 above).
Which way is the better way? 2, 3 or none of them? tks
CodePudding user response:
Create the extension in a /docker-entrypoint-initdb.d
script (1). Load the data using your application's migration system (3).
Mechanically, one good reason to do this is that the database init scripts only run the very first time you create a database container on a given storage. If you add a column to a table and need to run migrations, the init-script sequence requires you to completely throw away and recreate the database.
Philosophically, I'd give you the same answer whether you were using Docker or something else. You could imagine running a database on a dedicated server, or using a cloud-hosted database. You'd have to ask your database administrator to install the extension for you, but they'd generally expect to give you credentials to an empty database and have you load the data yourself; or in a cloud setup you could imagine checking a "install this extension" checkbox in their console but there wouldn't be a way to load the data without connecting to the database remotely.
So, a migration system will work anywhere you have access to the database, and will allow incremental changes to the schema. The init script setup is Docker-specific and requires deleting the database to make any change.