The Problem
Suppose I have a simple CRUD web application. The application is containerized and developed locally and set up with main/staging/develop branches on Github. CI/CD is configured with Github actions. Merges to main trigger a deployment to AWS App Runner.
Generally, we need two main AWS services here: Cloud Formation, ECR, and App Runner. Where does the IaC with AWS CDK belong?
Approaches
Have a separate repository for the IaC. Run this repository once to setup the App Runner service and a dedicated ECR repository. Treat the ECR repository URI as an environment variable in the application repository. On merge to main in the application repository, Github Actions rebuilds and pushes the image to the ECR repo. The App Runner service detects the new image and redeploys per this documentation.
- Pros: Separation of responsibilities between repositories. One is for infrastructure, One is for application code. CDK only runs once. Deployments are far simpler and easier to diagnose.
- Cons: Significant manual overhead. More work to change infrastructure.
AWS CDK supports declaring an App Runner service with a local docker image per this documentation. Simply create an AWS CDK project directly in the application repository. Upon merge to main, it simply re-runs the CDK with the new image.
- Pros: One repository. Almost entirely automated with no manual overhead from infrastructure/DevOps team.
- Cons: Developers may have to worry about IaC. Potential compute overhead with constant re-runs of CDK.
The Actual Questions
Which approach is best for websites/API that might rely on multiple backend services? Which one fits the best in a development culture that relies heavily on microservices? Is there another approach I'm not thinking of? Am I asking the wrong questions?
I personally prefer approach 2 because I hate manual overhead. I have less experience in microservices, so I was hoping some people with more industry experience could present some insight.
If this is the wrong place to ask this question or if I need to be more specific, please comment below and I'll adjust accordingly.
CodePudding user response:
Have a separate repository for the IaC
The most compelling reason to do this is to decouple CI/CD for IaC from the app repo. For example, if the application codebase is not continuously delivered, and the IaC is committed to the same repo as the codebase, you end up in a scenario where the IaC is versioned along with the code. If you deploy an old version, or a branch, does that version or branch get IaC from the same ref in the repo as the code itself? If so, you're in a position where you have to merge IaC changes across branches to make them deployable, which is a huge headache.
Most development teams as of 2022 do want to continuously deliver their code and "fail forward" rather than rolling back to old versions. In this scenario it doesn't really matter - because the main
branch of the code is also the main
branch of the infrastructure. But in this case, rolling back to an old version of the code inherently means rolling back to the matching version of the infrastructure, so you can't do it without looking very carefully at what infrastructure changes were made between the two versions and whether it's safe to roll back infra changes or not.
On the other hand, if the IaC repo is separate from the code repo, the IaC can be made to accommodate multiple versions of the code. Dependencies still exist - new features in the app that require new infrastructure are inherently dependent on the Infra as Code changes, and you don't have the shared repo to make sure those dependencies are deployed before the app.
It usually comes down to a question of ownership. If the infrastructure is primarily managed by a distinct group, then putting the infra in a separate repository makes a lot of sense because commingling infrastructure changes with code changes makes it hard for these groups to operate independently. To push out an infrastructure change from a different repo, is essentially an isolated step. To push out an infrastructure change from the same repo, requires merging a PR into the code base and deploying that. If the infra change is the only thing being deployed , that's pretty straight forward. but if the CI branch of the codebase is in a messy state then the infrastructure becomes undeployable because the code is undeployable. If the infrastructure is owned by the team whose job it is to also keep the code repo clean and deployable, then splitting the repos apart doesn't do much good.
Having spent the last 12 years or so doing DevOps, I'm pretty attracted to putting IaC in a separate repo for messy applications whose teams struggle with continuous delivery. That way when I want to make infrastructure changes, I can consider them in relative isolation and can deploy them to all environments regardless of which version of the code is deployed there. It really sucks to be trying to migrate database hosting, for example, if you need to work with the product team to get your IaC into each version of the code deployed into each environment. But it's not a free lunch - I still have to make sure to coordinate dependencies between the infra and the code, of course.
The smaller the service, the more the development team also handles IaC, and the more disciplined the development team's approach to CICD , the less it matters. If the same code goes out to dev/prod anyway and code merge and deploy is a frequent, comfortable thing, then the IaC may as well be in the app codebase. But you have to be ready to be limited to a fail-forward, continuously integrated approach, and accept that infrastructure and code deployments are coupled at the repo layer.
Most microservice dev teams tend to own their own IaC, continuously deliver their application, and put their IaC in the same repo as their code.