This is more of a design question than anything else.
Until recently I have been using Django with SQLite in my development environment, but I have now changed to PostgreSQL for production. My app is deployed with Heroku, and after some days I realized that they do random maintenance to the DB and it goes down during a few minutes.
For example, having a model with 3 tables, one Procedure
which each of them point to a ProcedureList
, and a ProcedureList
can have more than one Procedure
. A ProcedureUser
which links a ProcedureList
and a user and sets some specific variables for the user on that ProcedureList
. Finally there is a ProcedureState
which links a Procedure
with its state for an specific user.
On my app, in one of the views I have a function that modifies the DB in the following way:
user = request.user
plist = ProcedureList.objects.get(id=idFromUrl)
procedures = Procedure.objects.filter(ProcedureList=pList)
pUser = ProcedureUser(plist, user, someVariables)
pUser.save()
for procedure in procedures:
pState = ProcedureState(plist, user, pUser, procedure, otherVariables)
pState.save()
So what I'm thinking now, is that if Heroku decides to go into maintenance between those object.save()
calls, we will have a problem. The later calls to .save()
will fail and the DB will be corrupted. The request by the user will of course fail and there will be no way to rollback the previous insertions, because the connection with the DB is not possible.
My question is, in case of a DB fail (given by Heroku maintenance, network error or whatever), how are we supposed to correctly rollback the DB? Shall we make a list of insertions and wait for DB to go up again to roll them back?
I am using Python 3 and Django 4 but I think this is more of a general question than specific to any platform.
CodePudding user response:
in case of a DB fail (given by Heroku maintenance, network error or whatever), how are we supposed to correctly rollback the DB?
This is solved by databases through atomic transactions [wiki]. An atomic transaction is a set of queries that are committed all or none. It is thus not possible that for such transaction, certain queries are applied whereas others are not.
Django offers a transaction
context manager [Django-doc] to perform work in a transaction:
from django.db import transaction
with transaction.atomic():
user = request.user
plist = ProcedureList.objects.get(id=idFromUrl)
procedures = Procedure.objects.filter(ProcedureList=pList)
pUser = ProcedureUser(plist, user, someVariables)
pUser.save()
ProcedureState.objects.bulk_create([
ProcedureState(plist, user, pUser, procedure, otherVariables)
for procedure in procedures
])
At the end of the context block, it will commit the changes. This means that if the database fails in between, the actions will not be committed, and the block will raise an exception (usually an IntegrityError
).
Note: Django has a
.bulk_create(…)
method [Django-doc] to create multiple items with a single database query, minimizing the bandwidth between the database and the application layer. This will usually outperform creating items in a loop.