Django: How to keep code and database sync'ed when deploying on Heroku?

Let's say we have a Django project called Alpha. Developers work on Alpha in their dev environments before deploying the project to Heroku. The Procfile may look something like:

release: python manage.py migrate
web: python -m gunicorn wsgi:application

When a developer deploys a new version of Alpha, the project is shipped together with its migration files (as recommended). At the end of the deployment, Heroku executes the release statement python manage.py migrate to make all the relevant changes to the database. Because release is part of the overall deployment process, if it fails then the new code will not be deployed.

However... While the code will be reverted to what it was before the deployment (as expected), any changes already made to the Heroku database are going to be permanent. For example, let's assume that the new version has three new migrations:

0011
0012
0013

The first two migrations are executed correctly and the database is changed accordingly. They are also recorded in the database table django_migrations. The last one, however, contains an issue (e.g., it defines a constraint that the existing stage data violate).

In this scenario, the code on Heroku (both /migrations/ and models.py) and the database on Heroku are now out of sync: the database reflects changes whose migrations are not even present in the deployed code. This can cause all sorts of problems.

How can one avoid ending up in this predicament and ensure code and data are always 100% sync'ed on both stage and prod?

Post Scriptum

Heroku has the following snippet as part of its documentation:

Use transactions for database migrations

When performing a database migration, always use transactions. A transaction ensures that all migration operations are successful before committing changes to the database, minimizing the possibility of a failed partial migration during the release phase. If a database migration fails during the release phase (i.e., the migration command exits with a non-zero status), the new release will not be deployed. If transactions were not used, this could leave the database in a partially migrated state. We suggest using heroku run, rather than the release phase, to make schema/data corrections.

Answer:

As the documentation you quoted says, doing migrations in a transaction is a good idea. It's such a good idea that Django does this by default with databases that support it.

But this only helps for the scope of one migration: if migration 0013 fails as in your example, the changes it introduces will be rolled back. The changes introduced by migrations 0011 and 0012 will not be reversed.
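To make the per-migration scope concrete, here is a minimal sketch that imitates the behaviour with Python's built-in sqlite3 module rather than Django itself. The table names, the SQL statements and the migrations_log table are all made up for illustration; the only point is that each "migration" gets its own transaction, so a failure in the third one leaves the first two committed.

```python
import sqlite3

# Hypothetical stand-ins for migrations 0011-0013: (name, SQL statements).
MIGRATIONS = [
    ("0011", [
        "CREATE TABLE author (id INTEGER PRIMARY KEY, email TEXT)",
        # Seed duplicate rows so that 0013 has existing data to trip over.
        "INSERT INTO author (email) VALUES ('a@example.com')",
        "INSERT INTO author (email) VALUES ('a@example.com')",
    ]),
    ("0012", ["ALTER TABLE author ADD COLUMN name TEXT"]),
    ("0013", [
        "CREATE TABLE tag (id INTEGER PRIMARY KEY)",
        # Fails: the existing rows violate the new uniqueness constraint.
        "CREATE UNIQUE INDEX author_email_uniq ON author (email)",
    ]),
]


def apply_migrations(conn, migrations):
    """Apply each migration in its own transaction; stop at the first failure."""
    applied = []
    for name, statements in migrations:
        conn.execute("BEGIN")
        try:
            for sql in statements:
                conn.execute(sql)
            conn.execute("INSERT INTO migrations_log (name) VALUES (?)", (name,))
            conn.execute("COMMIT")
        except sqlite3.Error:
            # Only the failing migration is undone; earlier ones stay committed.
            conn.execute("ROLLBACK")
            break
        applied.append(name)
    return applied


def run_demo():
    # isolation_level=None lets us issue BEGIN/COMMIT/ROLLBACK explicitly.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE migrations_log (name TEXT PRIMARY KEY)")
    applied = apply_migrations(conn, MIGRATIONS)
    tables = sorted(
        row[0]
        for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
    )
    conn.close()
    return applied, tables
```

Calling run_demo() returns (['0011', '0012'], ['author', 'migrations_log']): the tag table created at the start of 0013 is rolled back together with the failed index, while everything from 0011 and 0012 remains in place.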

To reverse migrations 0011 and 0012 you'd have to roll back manually, e.g.

python manage.py migrate myapp 0010

But since your application will be running the previous release if your deployment fails, you can't simply heroku run that: the previous release doesn't contain the migration files for 0011 and 0012.

This is likely, at least in part, why the Heroku documentation says:

We suggest using heroku run, rather than the release phase, to make schema/data corrections.

If you remove your migration release phase command you could start deploying like so:

git push heroku main
heroku run python manage.py migrate

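If the migration fails, reverse the migrations that did get applied (0011 and 0012 in your example) while the new release, which contains their migration files, is still active, and then roll back to the previous release: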

heroku run python manage.py migrate myapp 0010
heroku releases:rollback
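Rather than hard-coding 0010, you could record the rollback target before you deploy. The sketch below assumes you capture the output of python manage.py showmigrations --plan and that each line looks like "[X]  myapp.0010_some_name" for applied migrations and "[ ]  ..." for pending ones. The helper name last_applied and the migration names in the sample are hypothetical.

```python
import re

# Assumed line format of `showmigrations --plan`: "[X]  app_label.migration_name"
PLAN_LINE = re.compile(r"^\[(?P<mark>[X ])\]\s+(?P<app>\w+)\.(?P<name>\S+)")


def last_applied(plan_output, app_label):
    """Return the name of the last applied migration for app_label, or None."""
    last = None
    for line in plan_output.splitlines():
        match = PLAN_LINE.match(line.strip())
        if match and match["app"] == app_label and match["mark"] == "X":
            last = match["name"]
    return last


# Hypothetical output captured before deploying the new release.
SAMPLE_PLAN = """\
[X]  auth.0001_initial
[X]  myapp.0009_add_slug
[X]  myapp.0010_add_index
[ ]  myapp.0011_new_table
"""

rollback_target = last_applied(SAMPLE_PLAN, "myapp")
```

Here rollback_target ends up as "0010_add_index", which can be passed to migrate myapp in place of 0010. If several apps have new migrations, each app needs its own target.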

As an extra precaution, consider putting your app into maintenance mode before you start your upgrade. This will prevent users from interacting with your site while it is potentially in an inconsistent state, e.g. after you deploy but before you apply your migrations.
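Put together, the whole procedure could be scripted roughly as follows. This is only a sketch under several assumptions: the Heroku CLI is installed and logged in, the app is inferred from the heroku git remote, a single app label needs reversing, and heroku run is given --exit-code so that a failed migrate shows up as a non-zero exit status. The function names and outcome strings are made up.

```python
import subprocess

MIGRATE = ["heroku", "run", "--exit-code", "python", "manage.py", "migrate"]


def run_command(cmd):
    """Run a command and return its exit status."""
    return subprocess.run(cmd).returncode


def deploy(app_label, rollback_target, run=run_command):
    """Deploy, migrate, and undo both if the migration fails."""
    run(["heroku", "maintenance:on"])

    if run(["git", "push", "heroku", "main"]) != 0:
        run(["heroku", "maintenance:off"])
        return "push_failed"

    if run(MIGRATE) == 0:
        outcome = "deployed"
    else:
        # Reverse while the new release (with its migration files) is still live.
        if run(MIGRATE + [app_label, rollback_target]) != 0:
            # Leave maintenance mode on: the database needs a manual fix.
            return "needs_manual_fix"
        run(["heroku", "releases:rollback"])
        outcome = "rolled_back"

    run(["heroku", "maintenance:off"])
    return outcome
```

A call such as deploy("myapp", "0010") would then either finish the upgrade or put code and database back where they started. The run parameter exists only so the flow can be exercised with a fake runner instead of real commands.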