Will rebase include the old feature branch in a PR?

Time:08-26

There is a feature branch I submitted a PR for, and it hasn't been approved yet. Now there is a new feature I have to work on that should be based on that feature branch. I thought of using re-base to start working on the new branch with the latest commits from the other feature branch (that hasn't been approved/merged yet). In this case, when I submit a PR for the new feature branch, will that PR include the commits of the old feature branch, or only the commits of the new feature branch?

CodePudding user response:

Let's assume master has commits m1-m2.

Your feature A branch, rebased on master, is m1-m2-a1.

The PR for branch A has one commit: a1.

If you rebase feature B on feature A, your tree is m1-m2-a1-b1.

The PR for branch B will have two commits: a1 and b1.

Once the PR for branch A is completed with a rebase, master will be m1-m2-a1, and the PR for branch B will eventually contain only one commit: b1.
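To check this locally, here is a minimal sketch in a throwaway repository. The names (m1, a1, featureA, featureB) are made up for illustration, and master..featureB is used as a stand-in for "the commits a PR from B into master would list". Landing A is simulated with a fast-forward, which is an assumption about how the host completes the PR.

```shell
set -e
# Throwaway repository; all branch and commit names are illustrative.
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Helper: one commit that adds a file named after the commit.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit m1; commit m2
git checkout -q -b featureA; commit a1
git checkout -q -b featureB; commit b1

# Commits reachable from featureB but not from master, oldest first.
before=$(git log --reverse --format=%s master..featureB | xargs)
echo "before A lands: $before"    # a1 b1

# Simulate PR A landing without rewriting a1 (fast-forward).
git checkout -q master
git merge -q --ff-only featureA

after=$(git log --reverse --format=%s master..featureB | xargs)
echo "after A lands: $after"      # b1
```

If the host rewrites a1 when it completes the PR (a squash, or a rebase that creates a new commit), the hash changes, and a1 would likely keep showing up in B's PR until B is rebased onto the updated master.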

CodePudding user response:

Git itself is totally ignorant of pull requests (which are a GitHub or Bitbucket feature if you're using GitHub or Bitbucket: note that GitLab has something similar, but they are called MRs, Merge Requests). So git rebase doesn't know that some commit(s) are in some PR(s); it literally cannot know this.

Answering this question correctly therefore requires defining both the PR process and the rebase operation. Since you haven't specified the hosting site, defining the PR part is not possible, but we can generalize a bit:

  • To make a pull request or merge request on some hosting site, we must git push some set of commits to that site, then use the features of that site to generate the PR/MR. The PR/MR goes into some non-Git auxiliary database where people can inspect it and fuss with it and do whatever, but in any case, there is some set of commits involved, and initially, at least, those commits include the ones you sent with git push.

  • git rebase is all about copying commits: we have some existing set of commits, where we like some thing(s) about those commits, but we dislike some other thing(s) about those commits. No commit can ever be changed: the actual name of a commit is its hash ID, and its hash ID is determined by doing a hashing operation on its content (specifically on the metadata but the metadata depend on the data as well, so we get a Merkle-tree-ish setup, where the final hash covers everything).

When we copy the commits using git rebase, we then instruct our Git to stop using the old commits in favor of the new, copied commits. This has no effect on copies that are in other Git repositories, because it literally can't: those copies are their copies, and like all commits, cannot be changed. We may be able to instruct those other Git repositories to stop using the original commits and start using others instead. If, when, and how that influences any Pull Requests is up to those other sites.

When we use git rebase, we must tell our own Git software two things:

  • Which commits should it copy? Which ones shouldn't it copy? (This is one thing, with two sides, like the two sides of one coin.)

  • Where should the copies go?

Git finds commits through branch names, and for git rebase, the end result of the rebase is reflected by taking one branch name—just one, at least for now—and making it find the new copies instead of the originals that we copied.

So, we have some branch names that find some commits:

...--G--H   <-- main
         \
          I--J   <-- feature1
              \
               K--L   <-- feature2 (HEAD)

Here, we are "on" branch feature2 in our Git repository. The name feature2 finds commit L specifically: that commit is the tip commit of the current branch. Commit L finds commit K, which finds commit J, which finds commit I, which finds commit H, and so on, backwards, in Git's usual fashion.

Meanwhile, branch name feature1 finds commit J: commit J is the tip commit of branch feature1. The name main finds H, so H is the tip commit of branch main. All the commits are on branch feature2, even though commits up through and including H are on main, and commits up through and including J are on feature1.

Git has two ways to use a branch name (or anything that finds any one specific commit): it can either select just that one commit, or it can select that one commit with history. The "select with history" operation is kind of like a flood-fill, except that it only goes one direction, namely backwards. For ordinary commits that have just one parent, it's straightforward; for merge commits, that have two or more parents, that's when we need something more complex.
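As a rough illustration of "one commit" versus "one commit with history", the sketch below rebuilds the G-H / I-J / K-L graph in a scratch repository. The single-letter commit subjects and the commit helper are hypothetical stand-ins for real work.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Helper: one commit that adds a file named after the commit.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit G; commit H
git checkout -q -b feature1; commit I; commit J
git checkout -q -b feature2; commit K; commit L

# The name alone: exactly one commit, the tip of feature1.
tip=$(git log -1 --format=%s feature1)

# The name with history: the tip plus everything reachable backwards from it.
history=$(git log --format=%s feature1 | xargs)

# Which branch names can reach commit H (the tip of main)? All three.
holders=$(git branch --format='%(refname:short)' --contains main | xargs)

echo "tip: $tip"              # J
echo "history: $history"      # J I H G
echo "contain H: $holders"    # feature1 feature2 main
```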

In any case, the general form of rebase is:[1]

git rebase --onto <target> <upstream>

The upstream argument is the "what not to copy" specifier. Git selects this with history, "flood filling" the commit graph with temporary "red paint". The commits thus painted won't get copied.

The commits that do get copied are then selected via HEAD, i.e., the current branch name.[2] Git selects this with history, "flood filling" the commit graph with temporary "green paint", but not overwriting any red. So only the commits not specifically de-selected get copied.

The copies then go after the commit specified by the target argument.

If you omit the --onto and target argument, the target for the rebase is the commit specified by the upstream argument. This works well for cases like:

...--o--*--o--o--o   <-- mainline
         \
          A--B--C   <-- feature (HEAD)

where we just run git rebase mainline. Git paints all the o and * commits red (temporarily), then paints C-B-A green and stops because * is red, and thus copies C-B-A.
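The mainline-and-feature case can be tried in a scratch repository. This is only a sketch: the commit subjects (star, o1, o2, A, B, C) are placeholders for the diagram's nodes, and mainline..feature is used to preview the "green" set before rebasing.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/mainline
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Helper: one commit that adds a file named after the commit.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit star                                   # the * commit in the diagram
git checkout -q -b feature; commit A; commit B; commit C
git checkout -q mainline; commit o1; commit o2
git checkout -q feature

# Preview: reachable from feature, minus everything reachable from mainline.
to_copy=$(git log --reverse --format=%s mainline..feature | xargs)
old_tip=$(git rev-parse feature)

git rebase -q mainline

new_tip=$(git rev-parse feature)
result=$(git log --reverse --format=%s feature | xargs)

echo "copied: $to_copy"     # A B C
echo "history: $result"     # star o1 o2 A B C
```

The subjects are the same after the rebase, but old_tip and new_tip differ, because the copies are new commits with new hash IDs.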

Having chosen commits to copy, Git makes sure to put them into the correct order. Git also knocks out specific commits:

  • If --fork-point is active, commits via the fork-point selection are dropped.
  • Unless --rebase-merges (or the older --preserve-merges, which was deprecated and has since been removed) is in effect, all merge commits are dropped.
  • "Patch-ID-equivalent" commits are dropped (Git uses the symmetric difference code for this, so it's really using upstream...HEAD rather than upstream..HEAD as claimed in the documentation).
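The patch-ID rule in the last bullet can be observed with git cherry, which marks commits whose patch already exists upstream with a leading "-". The sketch below assumes a scratch repository in which mainline has picked up its own copy of A via cherry-pick. The names are illustrative.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/mainline
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Helper: one commit that adds a file named after the commit.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit star
git checkout -q -b feature; commit A; commit B
git checkout -q mainline; commit o1
git cherry-pick feature~1 >/dev/null     # mainline now has its own copy of A
git checkout -q feature

# "-" lines: patch already upstream; "+" lines: still needs copying.
dropped=$(git cherry -v mainline feature | grep '^-' | cut -d' ' -f3-)
kept=$(git cherry -v mainline feature | grep '^+' | cut -d' ' -f3-)

git rebase -q mainline
copied=$(git log --reverse --format=%s mainline..feature | xargs)

echo "already upstream: $dropped"    # A
echo "copied by rebase: $copied"     # B
```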

The remaining commits form the list to be copied (or, for interactive rebase, turned into pick commands, before those are perhaps modified by --autosquash).

Having generated the correct list, Git then uses git switch --detach target to get to the commit where the copies go. It then uses git cherry-pick or equivalent[3] to copy each commit.

(There is, however, a shortcut that rebase will sometimes use here: if the commit to be copied is already the next commit, Git will fast-forward over that commit. You can suppress this with git rebase --force-rebase (also spelled -f or --no-ff) or similar options: see the documentation.)

Once all the commits are copied (or fast-forwarded into place if allowed and possible), Git will take the original branch name, whatever it was, and force it to point to the last-copied commit. For our mainline-and-feature example just above, the result is this:

                   A'-B'-C'  <-- feature (HEAD)
                  /
...--o--*--o--o--o   <-- mainline
         \
          A--B--C   [abandoned]

The temporary red-and-green-paint stuff is automatically deleted: in fact, it never got into the repository at all, as it's just in memory while Git is running the symmetric difference operation to generate the list of commits to copy. (It's not even "paint", it's just some bits in some data structures. The paint idea is to help visualize the effect.)
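The "[abandoned]" originals are not deleted right away. As a sketch, assuming a plain non-interactive rebase in a scratch repository, the old tip can still be found through the branch's reflog or through ORIG_HEAD. The commit names are illustrative.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/mainline
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Helper: one commit that adds a file named after the commit.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit star
git checkout -q -b feature; commit A; commit B; commit C
git checkout -q mainline; commit o1
git checkout -q feature

old_tip=$(git rev-parse feature)
git rebase -q mainline

# No branch name finds the originals any more, but they still exist.
via_reflog=$(git rev-parse 'feature@{1}')    # previous value of the branch name
via_orig_head=$(git rev-parse ORIG_HEAD)     # set by rebase before it starts copying
abandoned=$(git log --reverse --format=%s mainline.."$old_tip" | xargs)

echo "abandoned originals: $abandoned"       # A B C
```

Both lookups return the pre-rebase tip here. ORIG_HEAD can be overwritten by later commands, so the reflog is usually the more dependable of the two.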


[1] You can add one more parameter, <branch>, but that just means: run git switch to that branch first, then go back to the general form. When the rebase is done, you're "on" the given branch, as if you had run git switch yourself. So I prefer to discount this case.

[2] If you run git rebase while in detached HEAD mode, Git still selects the commits the same way. It just omits the "move one branch name" step at the end.

[3] Older Git versions use git format-patch and git am by default here; you have to force them to use the cherry-pick mode, except for interactive rebase, which requires actual cherry-picking.


Interactions with Pull Requests

Suppose we start with:

...--G--H   <-- main
         \
          I--J   <-- feature1
              \
               K--L   <-- feature2 (HEAD)

and git push both feature1 (commits I-J) and feature2 (K-L) to a GitHub repository, then use the GitHub CLI or web interface to make two PRs. The feature1 PR will ask someone to add commits I-J to some branch of some GitHub fork. The feature2 PR will ask someone to add commits I-J-K-L to some branch of some GitHub fork.

Whether or not anyone has done any of this, we now run git switch feature2 && git rebase --onto main feature1, using the --onto operation to produce:

          K'-L'  <-- feature2 (HEAD)
         /
...--G--H   <-- main
         \
          I--J   <-- feature1
              \
               K--L   [abandoned]

No change has occurred in any repository on GitHub yet.
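The local half of this can be sketched in a scratch repository. Here main..feature2 approximates what a feature2 PR against main would list, and the single-letter commits stand in for the diagram's nodes.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Helper: one commit that adds a file named after the commit.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit G; commit H
git checkout -q -b feature1; commit I; commit J
git checkout -q -b feature2; commit K; commit L

before=$(git log --reverse --format=%s main..feature2 | xargs)

# Do not copy anything reachable from feature1; put the copies after main.
git rebase -q --onto main feature1        # HEAD is feature2

after=$(git log --reverse --format=%s main..feature2 | xargs)
feature1_commits=$(git log --reverse --format=%s main..feature1 | xargs)

echo "feature2 before: $before"           # I J K L
echo "feature2 after:  $after"            # K L
echo "feature1:        $feature1_commits" # I J
```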

If we now git push the updated feature2 to the GitHub fork from which we generated the feature2 PR, GitHub will (after requiring us to use --force with this git push operation) update both our fork and our PR. Our updated PR will have commits K'-L' in it. If someone has already merged the old commits or otherwise used them in any way, we're too late: they have our K-L commits. If not, they can still get to our K-L commits, but the operations that are easy to reach will use our K'-L' commits.
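The push side can be rehearsed without GitHub by using a local bare repository as a stand-in for the fork. That is an assumption: a real host adds its own PR bookkeeping on top. The sketch uses --force-with-lease, which only overwrites the remote branch if it is still where we last saw it.

```shell
set -e
work=$(mktemp -d)
git init -q --bare "$work/hosted.git"     # stand-in for the fork on the hosting site
git init -q "$work/local" && cd "$work/local"
git symbolic-ref HEAD refs/heads/main
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Helper: one commit that adds a file named after the commit.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

git remote add origin "$work/hosted.git"
commit G; commit H
git checkout -q -b feature1; commit I; commit J
git checkout -q -b feature2; commit K; commit L
git push -q origin main feature1 feature2

git rebase -q --onto main feature1
local_tip=$(git rev-parse feature2)

# The hosted copy is untouched by the local rebase.
hosted_before=$(git ls-remote origin refs/heads/feature2 | cut -f1)

# A plain push is a polite request; the non-fast-forward update is refused.
if git push -q origin feature2 2>/dev/null; then plain=accepted; else plain=rejected; fi

# With a force option the push becomes a command.
git push -q --force-with-lease origin feature2
hosted_after=$(git ls-remote origin refs/heads/feature2 | cut -f1)

echo "plain push: $plain"                 # rejected
```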

If we don't use --onto, and copy commits I-J-K-L with --force-rebase or in any other way such that we get:

          I'-J'-K'-L'  <-- feature2 (HEAD)
         /
...--G--H   <-- main
         \
          I--J   <-- feature1
              \
               K--L   [abandoned]

and git push --force this to our GitHub fork to update our feature2 PR, we've now included the new copies, I'-J', as part of that PR. Should someone literally merge both our PRs, both copies of these commits will wind up in their merge result. But GitHub has "accept this PR" buttons that don't actually use git merge, but rather do the equivalent of either git rebase --force-rebase or git merge --squash. In those cases, none of our commits (by hash ID) wind up in their merge result. Our code may or may not wind up in their merge result.
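The squash case can be imitated locally with git merge --squash. This is a sketch of the idea only, not what GitHub runs internally: after the squash, the code is on main, but none of the original commit hash IDs are.

```shell
set -e
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Helper: one commit that adds a file named after the commit.
commit() { echo "$1" > "$1.txt"; git add "$1.txt"; git commit -q -m "$1"; }

commit G; commit H
git checkout -q -b feature2; commit K; commit L
tip=$(git rev-parse feature2)

# Stage the combined changes of feature2 on main, then record one new commit.
git checkout -q main
git merge --squash feature2 >/dev/null 2>&1
git commit -q -m "squash of feature2"

# Is the original tip commit part of main's history?
if git merge-base --is-ancestor "$tip" main; then reachable=yes; else reachable=no; fi
last=$(git log -1 --format=%s main)

echo "original L reachable from main: $reachable"   # no
echo "files on main: $(ls | xargs)"                 # G.txt H.txt K.txt L.txt
```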

Conclusion

The bottom line, as it were, is that you must know what rebase does and look at it in terms of which commits go where in the commit graph. But this only tells you about your repository. When you git push to your GitHub fork, this adds any new commits you have and adjusts the branch names in your fork to point to the new commits as requested or commanded by your git push (without force = polite request, with --force or --force-with-lease or similar = command).

On GitHub, pushing to a branch that has an open PR causes the other guy, whoever that is, to see the new commits as part of the PR. The old ones are still accessible, but much harder to find than the new ones. On other systems, you get whatever they give you.

Always draw the graph (in your head if nowhere else).
