I'm creating a CI workflow for a team of 50 Looker developers. I have read all the models on this page, but none really fits my requirements. The flow I currently have in mind is:
- The github repo has three branches: dev, stg and prod. We also have three Looker instances that connect to one of them;
- dev is essentially a volatile sandbox and can be re-created rather quickly from prod if any critical file has been deleted;
- We cannot afford a release branch method because merging the release branch into prod is a nightmare. Basically we cannot afford any model that requires a big merge from somewhere to stg or prd (in a team of 50 developers, even a daily merge is too problematic);
- stg is regarded as a stable environment where developers need to prove that their code works in such an environment before being merged into prod, a second stable environment. A merge into stg should be guarded by automated tests, but not by humans;
- prd is final and should be guarded by a human reviewer;
- Step 1: Developer A creates a feature branch from dev, say, A_feature_A1, eventually it gets merged into dev;
- Step 2: A checks that everything is fine in dev and creates a PR to merge the feature branch into stg. This time a github action is triggered to perform some automated validation and tests;
- Step 3: A checks that everything works as expected in stg, and creates a PR to merge the feature branch into prd. This time there is a human approver to approve the PR.
However I have a big issue in Step 1 -- should I create feature branches from dev or from prd/stg? I have tried both and neither works perfectly. If I create feature branches from dev, then because dev is always messed up, it's very difficult to merge the feature branch into either of the two stable branches. If I create feature branches from prd, Looker simply tells me that remote and local are not synced, which makes sense because dev is always different from prd.
How can I resolve this issue? There are a few other restrictions compared to software engineering work:
- Looker developers prefer to develop in Looker UI, which puts some restrictions on git operations (e.g. Say developers only have access to Looker-dev instance, which points to the dev branch, they simply CANNOT create feature branches from other branches);
- The workflow has to guarantee CI/CD, that is, 1) expect a lot of commits every day and they are expected to be moved into production ASAP, and 2) expect that the dev branch is messed up
CodePudding user response:
You currently seem to have a flow that's similar to the flow used by the git project or a slightly modified GitLab flow model. Also, you seem to have the following constraints:
- Looker developers prefer to develop in Looker UI, which puts some restrictions on git operations (e.g. Say developers only have access to Looker-dev instance, which points to the dev branch, they simply CANNOT create feature branches from other branches);
- The workflow has to guarantee CI/CD that is, 1) expect a lot of commits everyday and they are expected to be moved into production ASAP, and 2) expect that the dev branch is messed up.
- Basically we cannot afford any model that requires a big merge from somewhere to stg or prd (in a team of 50 developers, even a daily merge is too problematic).
You also mention in one of your comments that
Looker developers face business directly and they have to deliver as quickly as possible, sometimes with dirty results.
and
However we cannot afford using a release branch because business cannot wait for say a week/sprint to see results in production.
Approach 1: Adapting your current model
What should you branch from?
I would say you should always be branching off of prd for your feature branches and then merging into dev when you're ready. I say this because with pull requests that are made in GitHub/GitLab/etc. you'll be merging from the common ancestor of the branch you want to merge into up to the current HEAD of the feature branch. So if I branch off of dev or stg and then try to merge my feature into prd, I can potentially pull unstable or not-yet-reviewed code into prd. There are ways around this issue when branching off of other branches such as rebasing or adopting a patch-set workflow with stacked git (if you don't want to use the traditional email workflow). However, as I've never heard of Looker and don't know its full scope of restrictions, I'm going to assume there are limitations in place if it's restricting which branches people can branch from (though I am assuming people can branch off of branches they've created).
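To make the common-ancestor point concrete, here is a minimal sketch in plain Python (it does not call git). It models each branch as the set of commits reachable from it, so the set difference approximates what a pull request would bring into the target. The commit names and the helper commits_in_pull_request are hypothetical, made up purely for illustration.

```python
def commits_in_pull_request(feature: set[str], target: set[str]) -> set[str]:
    """Commits a PR would introduce: reachable from feature but not from target."""
    return feature - target


# Hypothetical history: prd holds reviewed commits, dev also carries unreviewed work.
prd = {"p1", "p2"}
dev = prd | {"d1_unreviewed", "d2_broken"}

# A feature branch cut from dev inherits everything dev has.
feature_from_dev = dev | {"f1"}
# A feature branch cut from prd only adds its own commit.
feature_from_prd = prd | {"f1"}

print(sorted(commits_in_pull_request(feature_from_dev, prd)))
print(sorted(commits_in_pull_request(feature_from_prd, prd)))
```

The first PR would carry dev's unreviewed commits along with f1, while the second carries only f1. That is the reason for branching from prd.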
Avoiding merge conflicts
Also, git is exceptional at performing merges. Use that to your advantage. You mention that daily merges are too problematic but also that business can't wait a week to see things in production. Taking these excerpts from your post and subsequent comments, it sounds like you're afraid of frequent merges yet still need to meet the business's expectations of rapid releases. As long as your dev team is communicating with one another during development and not making massive changes, git should be capable of handling many merges per day. If there are merge conflicts, fix them. For advice on how to avoid merge conflicts see Open Bank Project's article and Gehsan's blog post.
Modifying the existing model
If I were to modify your branching model, I would define the branches as follows:
- prd: the current state of production
- stg: the stable testing branch (contains all changes in prd)
- dev: your primary integration branch (contains all changes in prd)
You should be frequently overwriting dev with prd. This way developers are integrating into a semi-stable environment, rather than an always broken environment. Maybe start with doing this weekly and see where it goes. Yes, developers who have not had their changes graduate to the stg branch will need to re-merge into dev, but this should not be a major issue. Just be sure to announce when you overwrite so developers can check if their branches need to be re-merged. By doing this, it should allow developers to have a more stable integration branch.
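As a rough sketch of what that periodic overwrite could look like, the Python below only builds the git command lines (it does not run them) and works out which feature branches would need re-merging. The function names, the origin remote name, and the example branch names are assumptions for illustration; the git commands themselves are standard, but whether you can force-push dev depends on your repository's branch protection settings.

```python
def reset_dev_commands(remote: str = "origin") -> list[list[str]]:
    """Git commands that point dev at the current prd and publish the result."""
    return [
        ["git", "fetch", remote],
        # -B creates dev or resets it if it already exists.
        ["git", "checkout", "-B", "dev", f"{remote}/prd"],
        # Refuses to overwrite if the remote dev moved since the fetch.
        ["git", "push", "--force-with-lease", remote, "dev"],
    ]


def branches_to_remerge(merged_into_dev: set[str], graduated: set[str]) -> list[str]:
    """Branches that lived only in dev and therefore disappear with the overwrite."""
    return sorted(merged_into_dev - graduated)


for command in reset_dev_commands():
    print(" ".join(command))

# Hypothetical branch names for the announcement to the team.
print(branches_to_remerge({"feature_A", "feature_B"}, {"feature_B"}))
```

The list returned by branches_to_remerge is what you would include in the announcement, so each owner knows to merge their branch into dev again.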
Addressing dirty results
You mention developers sometimes have to merge with "dirty results" to please business. This is just all around a bad practice. Even dev is intended for code that's been locally (or in a separate environment) tested by the individual developer and refactored, not code that's hot off the keyboard. If this is something business needs to see, you should have a frank conversation with them. If they're unwilling to budge, I would suggest having your developers create throwaway branches. The idea behind throwaway branches is you create them, maybe merge some other branches into them for testing, and then purge them when you're done. For example a developer working on a feature could do the following:
- Create branch feature_A
- Develop feature
- Create thrw_feature_A from feature_A
- Optionally merge stg into thrw_feature_A to ensure they're grabbing other stable updates.
- Show the feature to whoever needs to see it on the business side
- Delete the thrw_feature_A branch and either continue developing or merge into dev if done.
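The steps above can be sketched as a sequence of plain git commands. The Python helper below (throwaway_commands, a hypothetical name) only assembles the command strings so the order is easy to see. It assumes command-line git access and git 2.23 or later for git switch; the Looker UI may not expose all of these operations.

```python
def throwaway_commands(feature: str, merge_stg: bool = True) -> list[str]:
    """Command sequence for creating, using, and purging a throwaway branch."""
    throwaway = f"thrw_{feature}"
    # Create the throwaway branch from the feature branch and switch to it.
    commands = [f"git switch -c {throwaway} {feature}"]
    if merge_stg:
        # Optional: pull in other stable updates before the demo.
        commands.append("git merge stg")
    # (The demo to the business side happens here, outside git.)
    # Go back to the feature branch, then purge the throwaway branch.
    commands.append(f"git switch {feature}")
    commands.append(f"git branch -D {throwaway}")
    return commands


for command in throwaway_commands("feature_A"):
    print(command)
```

Because the throwaway branch is deleted rather than merged, nothing that was pulled in from stg for the demo ends up in feature_A.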
If that's not an option, you can also adopt feature flags (see the section on Trunk-based development below).
Approach 2: Trunk-based Development
I don't see in any of the links you shared any examples of trunk-based development. It's common in large companies and scales very well. Atlassian has a good article about it. In summary, it's where all changes are merged directly into a single trunk branch (usually called master; prd in your setup). Features that are not complete or buggy are ignored through the use of feature flags. In this way, small updates to the code can be made to the trunk and then immediately tested and deployed, if stable, by your CI/CD pipeline.
With this approach everyone could simply branch off of and merge into the trunk. That would solve your issue with Looker's restrictions and keep a steady flow of changes going out to ensure business is happy. The downside is this flow can take some time to perfect, and involves a lot of effort spent on automation.
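For readers unfamiliar with feature flags, here is a minimal, generic sketch in Python. It is not Looker-specific, and the flag name, the is_enabled helper, and render_dashboard are all hypothetical. The point is only that unfinished code can be merged while staying switched off until the flag is flipped.

```python
def is_enabled(flag: str, flags: dict[str, bool]) -> bool:
    """Unknown flags count as disabled, so unfinished work stays hidden by default."""
    return flags.get(flag, False)


def render_dashboard(flags: dict[str, bool]) -> str:
    """Choose the code path based on the flag instead of on the branch."""
    if is_enabled("new_revenue_dashboard", flags):
        return "new dashboard"
    return "current dashboard"


# Hypothetical flag configuration as it might be deployed to production.
production_flags = {"new_revenue_dashboard": False}
print(render_dashboard(production_flags))
```

In a real setup the flag values would come from configuration rather than a literal dictionary, so a feature can be turned on without another merge.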