I'm creating a CI workflow for a team of 50 Looker developers. I have read all the models on this page, but none really fits my requirements. The current flow I have in mind is:

The GitHub repo has three branches: dev, stg and prd. We also have three Looker instances, each connected to one of these branches;
dev is essentially a volatile sandbox and can be re-created rather quickly from prd if any critical file has been deleted;
We cannot afford a release branch method because merging the release branch into prd is a nightmare. Basically we cannot afford any model that requires a big merge from somewhere to stg or prd (in a team of 50 developers, even a daily merge is too problematic);
stg is regarded as a stable environment where developers need to prove that their code works before it is merged into prd, a second stable environment. A merge into stg should be guarded by automated tests, but not by humans;
prd is final and should be guarded by a human reviewer;
Step 1: Developer A creates a feature branch from dev, say A_feature_A1, which eventually gets merged into dev;
Step 2: A checks that everything is fine in dev and creates a PR to merge the feature branch into stg. This time a GitHub Action is triggered to perform some automated validation and tests;
Step 3: A checks that everything works as expected in stg, and creates a PR to merge the feature branch into prd. This time there is a human approver to approve the PR.

However, I have a big issue in Step 1 -- should I create feature branches from dev or from prd/stg? I have tried both and neither works perfectly. If I create feature branches from dev, then because dev is always messed up, it's very difficult to merge the feature branch into either of the two stable branches. If I create feature branches from prd, Looker simply tells me that remote and local are not synced, which makes sense because dev is always different from prd.

How can I resolve this issue? There are a few other restrictions compared to software engineering work:

Looker developers prefer to develop in the Looker UI, which puts some restrictions on git operations (e.g. if developers only have access to the Looker-dev instance, which points to the dev branch, they simply CANNOT create feature branches from other branches);
The workflow has to guarantee CI/CD, that is, 1) expect a lot of commits every day that are expected to be moved into production ASAP, and 2) expect that the dev branch is messed up.

CodePudding user response：

You currently seem to have a flow that's similar to the flow used by the git project or a slightly modified GitLab flow model. Also, you seem to have the following constraints:

Looker developers prefer to develop in the Looker UI, which puts some restrictions on git operations (e.g. if developers only have access to the Looker-dev instance, which points to the dev branch, they simply CANNOT create feature branches from other branches);

The workflow has to guarantee CI/CD, that is, 1) expect a lot of commits every day that are expected to be moved into production ASAP, and 2) expect that the dev branch is messed up.

Basically we cannot afford any model that requires a big merge from somewhere to stg or prd (in a team of 50 developers, even a daily merge is too problematic).

You also mention in one of your comments that

Looker developers face business directly and they have to deliver as quickly as possible, sometimes with dirty results.

and

However we cannot afford using a release branch because business cannot wait for say a week/sprint to see results in production.

Approach 1: Adapting your current model

What should you branch from?

I would say you should always branch off of prd for your feature branches and then merge into dev when you're ready. I say this because a pull request made in GitHub/GitLab/etc. brings in every commit between the merge base (the common ancestor of the feature branch and the branch you want to merge into) and the current HEAD of the feature branch. So if I branch off of dev or stg and then try to merge my feature into prd, I can potentially pull unstable or not-yet-reviewed code into prd. There are ways around this issue when branching off of other branches, such as rebasing or adopting a patch-set workflow with Stacked Git (if you don't want to use the traditional email workflow). However, as I've never heard of Looker and don't know its full scope of restrictions, I'm going to assume there are limitations in place if it's restricting which branches people can branch from (though I am assuming people can branch off of branches they've created).
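
To make the merge-base point concrete, here is a minimal sketch of a toy commit graph in Python. The commit names (p1, d1, f_dev, and so on) are made up for illustration, and the model ignores everything git does beyond reachability. It only shows which commits a merge into prd would carry along, depending on where the feature branch started.

```python
# Toy model of a commit graph: each commit maps to the list of its parents.
# All commit names are hypothetical and exist only for this illustration.
PARENTS = {
    "p1": [],          # older prd history
    "p2": ["p1"],      # current tip of prd
    "d1": ["p2"],      # unreviewed work that landed on dev
    "d2": ["d1"],      # more unreviewed work on dev
    "f_dev": ["d2"],   # feature commit on a branch created from dev
    "f_prd": ["p2"],   # the same feature on a branch created from prd
}


def ancestors(commit):
    """Return the commit plus every commit reachable through its parents."""
    seen = set()
    stack = [commit]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(PARENTS[current])
    return seen


def brought_in_by_merge(feature_tip, target_tip):
    """Commits that merging feature_tip would add to target_tip."""
    return ancestors(feature_tip) - ancestors(target_tip)


print(sorted(brought_in_by_merge("f_dev", "p2")))  # ['d1', 'd2', 'f_dev']
print(sorted(brought_in_by_merge("f_prd", "p2")))  # ['f_prd']
```

In this toy graph, the branch created from dev drags d1 and d2 into prd along with the feature, while the branch created from prd brings only its own commit.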

Avoiding merge conflicts

Also, git is exceptional at performing merges. Use that to your advantage. You mention that daily merges are too problematic but also that business can't wait a week to see things in production. Taking these excerpts from your post and subsequent comments, it sounds like you're afraid of frequent merges yet still need to meet the business's expectations of rapid releases. As long as your dev team is communicating with one another during development and not making massive changes, git should be capable of handling many merges per day. If there are merge conflicts, fix them. For advice on how to avoid merge conflicts see Open Bank Project's article and Gehsan's blog post.

Modifying the existing model

If I were to modify your branching model, I would define the branches as follows:

prd: the current state of production
stg: the stable testing branch (contains all changes in prd)
dev: your primary integration branch (contains all changes in prd)

You should be frequently overwriting dev with prd. This way developers are integrating into a semi-stable environment, rather than an always-broken one. Maybe start by doing this weekly and see where it goes. Yes, developers whose changes have not yet graduated to the stg branch will need to re-merge into dev, but this should not be a major issue. Just be sure to announce when you overwrite so developers can check whether their branches need to be re-merged. Doing this should give developers a more stable integration branch.
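
As a rough sketch of what the periodic overwrite could look like, the Python helper below only builds the git commands and prints them (a dry run) rather than executing anything. It assumes the remote is named origin, that whoever runs it has plain git access outside the Looker UI, and that force-pushing to dev is permitted by your branch protection settings. The function names are hypothetical.

```python
import shlex


def reset_dev_commands(remote="origin", source="prd", target="dev"):
    """Build the git commands that make `target` identical to `source`."""
    return [
        ["git", "fetch", remote],
        # -B creates the local branch, or resets it if it already exists.
        ["git", "checkout", "-B", target, f"{remote}/{source}"],
        # --force-with-lease refuses to overwrite the remote branch if it has
        # moved since the last fetch, which is safer than a plain --force.
        ["git", "push", "--force-with-lease", remote, target],
    ]


def unmerged_branches_command(remote="origin", target="dev"):
    """List remote branches whose tips are not contained in the reset target.

    The output includes every remote branch, not only feature branches.
    """
    return ["git", "branch", "-r", "--no-merged", f"{remote}/{target}"]


# Dry run: print the commands instead of executing them.
for cmd in reset_dev_commands() + [unmerged_branches_command()]:
    print(shlex.join(cmd))
```

The last command is one way to back up the announcement: its output is the list of branches whose work is no longer in dev after the overwrite.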

Addressing dirty results

You mention developers sometimes have to merge with "dirty results" to please business. This is an all-around bad practice. Even dev is intended for code that has been tested by the individual developer (locally or in a separate environment) and refactored, not code that's hot off the keyboard. If this is something business needs to see, you should have a frank conversation with them. If they're unwilling to budge, I would suggest having your developers create throwaway branches. The idea behind throwaway branches is that you create them, maybe merge some other branches into them for testing, and then purge them when you're done. For example, a developer working on a feature could do the following:

Create branch feature_A
Develop feature
Create thrw_feature_A from feature_A
Optionally merge stg into thrw_feature_A to ensure they're grabbing other stable updates.
Show the feature to whoever needs to see it on the business side
Delete the thrw_feature_A branch and either continue developing or merge into dev if done.
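
The throwaway-branch steps above can be sketched as git commands. The Python helper below only assembles the command lists so they can be reviewed or printed; nothing is executed. It assumes a remote named origin and direct git access, and the helper name is hypothetical. Whether the same steps can be done from the Looker UI is something I can't confirm.

```python
import shlex


def throwaway_branch_commands(feature="feature_A", stable="stg",
                              remote="origin", merge_stable=True):
    """Return (create, cleanup) git command lists for a throwaway demo branch."""
    throwaway = f"thrw_{feature}"
    create = [
        ["git", "fetch", remote],
        # Start the throwaway branch at the current tip of the feature branch.
        ["git", "checkout", "-b", throwaway, feature],
    ]
    if merge_stable:
        # Optional step: pick up other stable updates before the demo.
        create.append(["git", "merge", f"{remote}/{stable}"])
    create.append(["git", "push", "-u", remote, throwaway])
    cleanup = [
        ["git", "checkout", feature],
        # -D deletes the branch even though it was never merged anywhere.
        ["git", "branch", "-D", throwaway],
        ["git", "push", remote, "--delete", throwaway],
    ]
    return create, cleanup


create, cleanup = throwaway_branch_commands()
for cmd in create + cleanup:
    print(shlex.join(cmd))
```

Because the stg merge happens only on thrw_feature_A, the feature branch itself stays clean and can still be merged into dev on its own later.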

If that's not an option, you can also adopt feature flags (see the section on Trunk-based development below).

Approach 2: Trunk-based Development

I don't see any examples of trunk-based development in the links you shared. It's common in large companies and scales very well. Atlassian has a good article about it. In summary, it's where all changes are merged directly into a single trunk branch (often called master; prd in your case). Features that are incomplete or buggy are kept switched off through the use of feature flags. In this way, small updates can be made to the trunk and then immediately tested and, if stable, deployed by your CI/CD pipeline.

With this approach everyone could simply branch off of and merge into prd. That would solve your issue with Looker's restrictions and keep a steady flow of changes going out to ensure business is happy. The downside is that this flow can take some time to perfect and involves a lot of effort spent on automation.
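
For readers unfamiliar with feature flags, here is a minimal, generic sketch in Python. It is not Looker-specific, and I don't know what mechanism Looker offers for this. The class, the flag name avg_order_value, and the revenue_summary function are all hypothetical. The point is only that unfinished code can be merged into the trunk while staying switched off.

```python
from dataclasses import dataclass, field


@dataclass
class FeatureFlags:
    """In-memory flag store; a real setup would load flags from configuration."""
    enabled: set = field(default_factory=set)

    def is_on(self, name: str) -> bool:
        return name in self.enabled


def revenue_summary(orders, flags):
    """Return the stable total, plus the unfinished average only if its flag is on."""
    total = sum(orders)
    summary = {"total": total}
    # The new, not-yet-approved metric is merged but hidden behind a flag.
    if flags.is_on("avg_order_value") and orders:
        summary["avg_order_value"] = total / len(orders)
    return summary


orders = [120, 80, 100]
print(revenue_summary(orders, FeatureFlags()))                     # {'total': 300}
print(revenue_summary(orders, FeatureFlags({"avg_order_value"})))  # {'total': 300, 'avg_order_value': 100.0}
```

With the flag off, production output is unchanged even though the new code is already on the trunk; turning the flag on for a demo exposes the feature without a separate branch.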