TL;DR
What is the optimal flow / best practice for running migrations in a CI/CD pipeline against a database that has no public endpoint?
GH Actions -> connect to remote db (somehow) -> run migration on said db
And how do you roll back when a deployment fails?
Problem
I need to set up a flow to deploy an application and its database migrations accordingly.
The main issue I run into is that it's best practice to run the database without a public endpoint, in a private VPC shared with the application service.
But how does one run migrations from a CI/CD pipeline in this case?
Current scenario
The stack here is nodejs, typeorm, Elastic Beanstalk (EBS) & AWS.
- Build a docker image with the application code & push it to a private ECR repository (app/dev-api:latest).
- Build a separate docker image that packages the migrations and push it to a separate private ECR repository (dbmigrations/dev:latest).
- As soon as the migrations image is pushed to dbmigrations/dev:latest, a Fargate task boots up in the same VPC as the RDS instance and runs the migrations (see the sketch after this list).
- If that Fargate task ran successfully, deploy the application code to Elastic Beanstalk.
- a) If something goes wrong during the deploy to EBS:
  - build a new docker image
  - push it to another ECR repository that contains the "rollback" migration
  - boot up another Fargate task
- b) If all tasks ran successfully, exit -> deploy successful.
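For concreteness, the step that triggers the migration task looks roughly like the sketch below (cluster, task definition, subnet and security group names here are placeholders, not my actual setup):

```bash
# Run the migrations image as a one-off Fargate task inside the private VPC
# (cluster, task definition, subnet and security group IDs are placeholders)
TASK_ARN=$(aws ecs run-task \
  --cluster migrations-cluster \
  --launch-type FARGATE \
  --task-definition dbmigrations-dev \
  --network-configuration 'awsvpcConfiguration={subnets=[subnet-0abc],securityGroups=[sg-0abc],assignPublicIp=DISABLED}' \
  --query 'tasks[0].taskArn' --output text)

# Block until the task stops, then check the container exit code
aws ecs wait tasks-stopped --cluster migrations-cluster --tasks "$TASK_ARN"
EXIT_CODE=$(aws ecs describe-tasks --cluster migrations-cluster --tasks "$TASK_ARN" \
  --query 'tasks[0].containers[0].exitCode' --output text)
[ "$EXIT_CODE" = "0" ] || exit 1
```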
Next scenario
Now I'm using the following stack: nodejs, prisma, and App Runner + RDS on AWS.
I would still like to run the database in a private VPC, but I'm not sure how I would run prisma migrations against a private database.
Also, I would think there is a simpler solution than running separate docker containers for your database migrations, as this can cause the deployed application and the database to get out of sync.
I know there are tools like liquibase & flyway, but both are paid (I think), and since prisma comes with a migration flow itself, I don't see why I would need yet another migration tool for this task.
Thanks in advance!
PS: I'm using Github Actions, but I'm looking for a general flow rather than a code example (I'm sure this will apply to other pipeline services as well).
CodePudding user response:
It's true that you shouldn't expose public IPs on databases or instances that don't need to be reachable from the internet. But such servers should still be accessible to developers for maintenance, debugging, etc.
This is where a bastion host comes in: a central, hardened proxy that lets you connect to private servers. You could open a tunnel through the bastion to RDS and run the migration over it.
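As a rough sketch (key file, bastion address, RDS hostname and credentials are all placeholders), the pipeline step could open an SSH tunnel through the bastion and point prisma at the local end of it:

```bash
# Forward local port 5433 to the RDS endpoint via the bastion host
# (key file, bastion address and RDS hostname are placeholders)
ssh -i bastion-key.pem -f -N \
  -L 5433:mydb.xxxxxx.eu-west-1.rds.amazonaws.com:5432 \
  ec2-user@bastion.example.com

# Run the prisma migrations through the tunnel
DATABASE_URL="postgresql://user:password@localhost:5433/mydb" npx prisma migrate deploy
```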
On AWS there is also a newer, more secure option (you still need an instance inside the VPC to connect through, though) called Session Manager, which lets you reach servers without opening inbound SSH ports: https://docs.aws.amazon.com/systems-manager/latest/userguide/session-manager.html
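With Session Manager, the tunnel could look something like the following (instance ID, RDS hostname and ports are placeholders; this assumes the AWS-StartPortForwardingSessionToRemoteHost SSM document is available in your region):

```bash
# Port-forward local 5433 to the RDS endpoint through an SSM-managed instance
# (no inbound SSH port needed; IDs and hostnames are placeholders)
aws ssm start-session \
  --target i-0123456789abcdef0 \
  --document-name AWS-StartPortForwardingSessionToRemoteHost \
  --parameters '{"host":["mydb.xxxxxx.eu-west-1.rds.amazonaws.com"],"portNumber":["5432"],"localPortNumber":["5433"]}'

# In another step/shell, run the migrations against the forwarded port
DATABASE_URL="postgresql://user:password@localhost:5433/mydb" npx prisma migrate deploy
```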