git: merging a lagging branch to current branch

Time:11-07

Normally, when we have to merge another branch (branch-to-merge) into our current branch (mybranch), we do:

git checkout mybranch
git pull origin branch-to-merge

I understand the way it works is that it would move mybranch pointer to same commit as head of branch-to-merge. If it's a linear scenario, all goes well; otherwise we need to resolve merge conflicts.

But how does it play out if branch-to-merge is behind mybranch? I basically want to integrate the changes in branch-to-merge (which is lagging) into my current feature branch, so that I can make use of the code changes made there.

CodePudding user response:

TL;DR: nothing happens (Git says "Already up to date." and exits with a success status).

Long

Your understanding isn't entirely wrong, but it's missing a crucial detail or two, which will cause issues in a moment:

git checkout mybranch
git pull origin branch-to-merge

... would move mybranch pointer to same commit as head of branch-to-merge.

Let's be precise here. The git checkout mybranch step:

  • removes any files from the current commit (as remembered in Git's index aka staging area);
  • extracts instead the files from the commit indicated by the name mybranch; and
  • makes mybranch the current branch.

(Or, of course, it can fail in various ways, but we can assume that doesn't happen. It can also succeed while retaining uncommitted work, as noted in "Checkout another branch when there are uncommitted changes on the current branch": this happens when Git can skip the remove-and-replace of particular files, which it generally attempts to do anyway for speed, whether or not those files have uncommitted work in them.)

Having finished this checkout, you then run git pull. The pull command is a combination of two other Git commands:

  1. git pull runs git fetch. In older versions of Git, it does this literally (it's a shell script and it actually runs git fetch with various arguments). In current Git, this is built into the pull program, but the effect is the same. This step obtains any commits that origin—a separate Git repository—has that your repository lacks. This step also updates the name origin/branch-to-merge in your own repository, in normal cases.

  2. git pull then runs the second command you've programmed it to run. If you have not set anything, it defaults to using git merge, but you can set it up to run git rebase. As with the fetch step, the old script literally ran these; the new C code has them built in, but the effect is the same. The arguments passed to git merge or git rebase are a bit complicated, but if we assume that you're using git merge here, and that other things are normal, we get the effect of running git merge origin/branch-to-merge (except for the default commit message). But you can also use git merge --no-ff here or, now, git merge --ff-only. If you use git rebase as your second command, the picture is rather more complicated.
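The two steps can be run by hand to see what each one does. The following is only a sketch in a throwaway setup: the directory names upstream and clone, the demo identity, and the empty commits are all made up for illustration, and it assumes the default (merge) mode of git pull.

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)

# Hypothetical upstream repository holding both branches.
git init -q "$work/upstream"
git -C "$work/upstream" checkout -q -b mybranch
git -C "$work/upstream" commit -q --allow-empty -m "base"
git -C "$work/upstream" branch branch-to-merge

# Our clone; inside it, "origin" refers to $work/upstream.
git clone -q "$work/upstream" "$work/clone"

# Someone adds a commit to branch-to-merge upstream after we cloned.
git -C "$work/upstream" checkout -q branch-to-merge
git -C "$work/upstream" commit -q --allow-empty -m "new upstream work"

cd "$work/clone"
git checkout -q mybranch

# Step 1 of the pull: obtain the new commits and update origin/branch-to-merge.
git fetch -q origin
# Step 2 of the pull: merge the remote-tracking name that the fetch just updated.
git merge -q --no-edit origin/branch-to-merge

mine=$(git rev-parse mybranch)
theirs=$(git rev-parse origin/branch-to-merge)
echo "mybranch=$mine"
echo "origin/branch-to-merge=$theirs"
```

In this particular setup mybranch had no commits of its own, so the merge step is a fast-forward and both names end up on the same commit.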

If it's a linear scenario, all goes well; otherwise we need to resolve merge conflicts.

This has a number of tricky details and corner cases. The git merge command operates under multiple scenarios, and with various arguments (--no-ff, --ff-only), and each of these has different effects. But in general we can describe merge as working this way:

  • First, Git makes sure we have a "clean" working tree (some merge cases don't check, but it's a really good idea to make sure that this is the case yourself if you're using one of these oddball merge variants). Git then locates the current commit hash ID, using the current branch name from HEAD or the raw hash ID from HEAD. This is the "HEAD" or "ours" commit if and when we get a merge conflict.

  • Git also locates the commit specified by the argument to git merge. That is, origin/branch-to-merge translates to a raw commit hash ID. Git finds this particular commit.

  • Git then computes the merge base of the two commits using the commit graph (see Lowest Common Ancestor of a Directed Acyclic Graph). This is the commit that is in some ways the most important in terms of determining the merge result.

We can draw the merge base in various ways. For instance, suppose we have this:

          o--L   <-- mybranch (HEAD)
         /
...--o--*
         \
          o--R   <-- origin/branch-to-merge

Here, the name mybranch locates commit L, the "left side" or "local" or HEAD or --ours commit. The name origin/branch-to-merge locates commit R, the other or "right side" or "remote" or --theirs commit. The commit marked * is the best common ancestor, so commit * is the merge base.

This kind of merge requires a true merge. It may produce a merge conflict, but if it does not, and if a true merge is permitted, the result of this merge is a new merge commit M:

          o--L
         /    \
...--o--*      M   <-- mybranch (HEAD)
         \    /
          o--R   <-- origin/branch-to-merge

Because this commit is made while "on" mybranch (in that git status says on branch mybranch), the branch name mybranch now points to merge commit M.
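To check that a true merge really produces a commit with two parents, one can build a small diverged history and inspect the result. This is a sketch under some assumptions: it runs in a temporary repository, uses one commit per side rather than two, and uses a local branch named branch-to-merge as a stand-in for origin/branch-to-merge.

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d) && cd "$repo"
git init -q
git checkout -q -b mybranch

git commit -q --allow-empty -m "merge base (*)"
git branch branch-to-merge            # stand-in for origin/branch-to-merge

echo left > left.txt
git add left.txt
git commit -q -m "L"

git checkout -q branch-to-merge
echo right > right.txt
git add right.txt
git commit -q -m "R"
git checkout -q mybranch

L=$(git rev-parse mybranch)
R=$(git rev-parse branch-to-merge)

# The histories have diverged, so this makes a new merge commit M.
git merge -q --no-edit branch-to-merge

# rev-list --parents prints "M L R"; drop the first field to keep the parents.
parents=$(git rev-list --parents -n 1 HEAD | cut -d' ' -f2-)
echo "parents of M: $parents"
```

The first parent is L (the commit that was current when the merge ran) and the second is R, and mybranch now names M.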

If there are merge conflicts, Git stops in the middle of the merge, with all three commits read into Git's index, and the merge partly resolved and partly unresolved. Your job as the human operating Git is to finish the merge, which cleans up the mess Git left behind; or you may choose to run git merge --abort to throw away the partial result, cleaning up the mess by, in essence, resetting to commit L. (This usually goes quite badly if you start a "dirty" merge and then use git merge --abort, which is why you should not start a "dirty" merge in the first place.)
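The stop-in-the-middle behavior and git merge --abort can be tried safely in a scratch repository. The sketch below assumes a clean working tree before the merge; the file name f.txt and the local stand-in branch branch-to-merge are invented for the example.

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d) && cd "$repo"
git init -q
git checkout -q -b mybranch

echo base > f.txt
git add f.txt
git commit -q -m "merge base (*)"
git branch branch-to-merge            # stand-in for origin/branch-to-merge

echo left > f.txt
git commit -q -am "L"

git checkout -q branch-to-merge
echo right > f.txt
git commit -q -am "R"
git checkout -q mybranch
L=$(git rev-parse HEAD)

# Both sides changed the same line, so the merge stops with a conflict.
if git merge --no-edit branch-to-merge >/dev/null 2>&1; then
  outcome="clean"
else
  outcome="conflict"
fi

# Paths that are still unmerged in the index.
unmerged=$(git diff --name-only --diff-filter=U)
echo "outcome: $outcome, unmerged paths: $unmerged"

# Throw away the partial merge and go back to commit L.
git merge --abort
after=$(git rev-parse HEAD)
leftover=$(git status --porcelain)
```

Because the merge started from a clean tree, the abort leaves nothing behind: HEAD is L again and git status reports no changes.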

If a true merge is required and you specified --ff-only, git merge does not even start the merge: it just says that a fast-forward is impossible and terminates with an error.
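A quick way to see the --ff-only refusal is to try it on a diverged history and confirm that nothing moved. Again this is a sketch in a temporary repository, with a local branch-to-merge standing in for origin/branch-to-merge.

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d) && cd "$repo"
git init -q
git checkout -q -b mybranch

git commit -q --allow-empty -m "merge base (*)"
git branch branch-to-merge            # stand-in for origin/branch-to-merge
git commit -q --allow-empty -m "L"
git checkout -q branch-to-merge
git commit -q --allow-empty -m "R"
git checkout -q mybranch

before=$(git rev-parse HEAD)

# A true merge would be needed here, so --ff-only makes Git give up at once.
if git merge --ff-only branch-to-merge 2>/dev/null; then
  result="merged"
else
  result="refused"
fi

after=$(git rev-parse HEAD)
echo "result: $result"
```

The branch name is untouched after the refusal, so there is nothing to clean up.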

Another scenario is illustrated by this drawing:

...--o--L   <-- mybranch (HEAD)
         \
          o--R   <-- origin/branch-to-merge

Here, the "merge base"—the best shared commit that is reachable from both commits L and R—is commit L. This is your linear scenario: Git can simply "move the branch name forward" while also checking out commit R, resulting in:

...--o--L--o--R   <-- mybranch (HEAD), origin/branch-to-merge

This is the default action if you specified --ff-only or did not use --no-ff. However, if you specified --no-ff, Git will go ahead and do a full merge anyway:

...--o--L------M   <-- mybranch (HEAD)
         \    /
          o--R   <-- origin/branch-to-merge

Commit M is a merge commit as before. The snapshot for commit M will match that for commit R, since the merge commit snapshot is made by combining the changes from the merge base (L) to each tip commit (L and R respectively). The changes from L to L are empty, so this produces the changes found from L to R. Git applies those changes to the snapshot in the merge base—i.e., in L—which produces a snapshot that matches commit R. Git then commits the result, as there were no merge conflicts.
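The claim that the --no-ff merge commit has the same snapshot as R can be checked by comparing tree hash IDs. The following sketch assumes a scratch repository, a made-up file.txt, and a local branch-to-merge in place of origin/branch-to-merge.

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d) && cd "$repo"
git init -q
git checkout -q -b mybranch

echo one > file.txt
git add file.txt
git commit -q -m "L"

# branch-to-merge is strictly ahead of mybranch by one commit.
git checkout -q -b branch-to-merge
echo two >> file.txt
git commit -q -am "R"
git checkout -q mybranch

L=$(git rev-parse mybranch)
R=$(git rev-parse branch-to-merge)

# A fast-forward is possible, but --no-ff forces a real merge commit M.
git merge -q --no-ff --no-edit branch-to-merge

first_parent=$(git rev-parse 'HEAD^1')
second_parent=$(git rev-parse 'HEAD^2')
tree_M=$(git rev-parse 'HEAD^{tree}')
tree_R=$(git rev-parse 'branch-to-merge^{tree}')
echo "tree of M: $tree_M"
echo "tree of R: $tree_R"
```

M is a distinct commit with parents L and R, yet its tree is identical to R's, which is what the diff arithmetic above predicts.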

But how does it play out if branch-to-merge is behind mybranch?

Note that in one case we already drew:

          o--L   <-- mybranch (HEAD)
         /
...--o--*
         \
          o--R   <-- origin/branch-to-merge

it is the case that origin/branch-to-merge is behind mybranch. It's just that, in spite of being behind by two commits, origin/branch-to-merge is also ahead by two commits. That is, there are two commits, those along the bottom line, that are "on" (reachable from) commit R that are not "on" (not reachable from) commit L.
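The "ahead" and "behind" counts can be computed directly with git rev-list --count and the two-dot range syntax. This sketch rebuilds the drawing above in a temporary repository, with empty commits and a local branch-to-merge standing in for origin/branch-to-merge.

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d) && cd "$repo"
git init -q
git checkout -q -b mybranch

git commit -q --allow-empty -m "merge base (*)"
git branch branch-to-merge            # stand-in for origin/branch-to-merge
git commit -q --allow-empty -m "left 1"
git commit -q --allow-empty -m "L"
git checkout -q branch-to-merge
git commit -q --allow-empty -m "right 1"
git commit -q --allow-empty -m "R"
git checkout -q mybranch

# Commits reachable from mybranch but not from branch-to-merge.
only_mine=$(git rev-list --count branch-to-merge..mybranch)
# Commits reachable from branch-to-merge but not from mybranch.
only_theirs=$(git rev-list --count mybranch..branch-to-merge)

echo "branch-to-merge is behind by $only_mine and ahead by $only_theirs"
```

A branch that is purely "lagging" would show a second count of zero; any nonzero second count means there is something to merge.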

Let's draw this other case though:

...--o--R   <-- origin/branch-to-merge
         \
          o--L   <-- mybranch (HEAD)

Here, the merge base of L and R is R. There is nothing to merge: the diff from R to R is empty, and applying the diff from R to L to the snapshot in R just produces the snapshot in L. Git could still make the same kind of "forced merge" commit it makes with --no-ff when R is strictly ahead of L, but Git won't do that. Instead it just says that it is already up to date.
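This is the case from the question, and it is easy to confirm that the merge is a successful no-op. The sketch below uses a scratch repository with a local branch-to-merge in place of origin/branch-to-merge; the exact wording of the message varies a little between Git versions, so the script only prints it.

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d) && cd "$repo"
git init -q
git checkout -q -b mybranch

git commit -q --allow-empty -m "R"
git branch branch-to-merge            # lagging branch, left behind at R
git commit -q --allow-empty -m "o"
git commit -q --allow-empty -m "L"

before=$(git rev-parse HEAD)

# R is already an ancestor of L, so there is nothing for the merge to do.
if message=$(git merge --no-edit branch-to-merge); then
  status=0
else
  status=$?
fi

after=$(git rev-parse HEAD)
echo "git merge said: $message (exit status $status)"
```

Everything reachable from branch-to-merge is already part of mybranch, which is why no commit is made and the branch name does not move.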

The bottom line

The merge command:

  1. locates the commits to merge (let's call them L and R for the two-commit case);
  2. locates the merge base (let's call this B); and
  3. uses this to drive the rest of the action.

For a standard two-head merge with the recursive, ort, or resolve strategy, there are three possibilities, considered in this order:

  • B = R: emit "up to date" message and terminate with success.
  • B = L: consider doing a fast-forward instead of a merge. If allowed, check out commit R, update the stored hash ID in the current branch, and terminate with success.
  • Otherwise, if allowed (no --ff-only), do a full merge. If this is successful (and neither --no-commit nor --squash was specified), make a new merge commit; otherwise stop. For --squash, don't record the hash ID of R in MERGE_HEAD; for other merges, do record the hash ID of R in MERGE_HEAD.

In the special case that B = L = R, we do the B=R test first and say "up to date". The --no-ff option simply forces the middle test to fail so that we go on to the third case.
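The three tests can be approximated from the command line with git merge-base --is-ancestor, which succeeds when its first argument is an ancestor of (or the same commit as) its second. The function name classify_merge and the branch names behind, ahead, and diverged are invented for this sketch; it predicts what a plain git merge would do and is not how Git implements the check internally.

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d) && cd "$repo"
git init -q
git checkout -q -b mybranch

# Hypothetical helper: $1 is L (the current commit), $2 is R (the one to merge).
# Unrelated histories (no merge base at all) are not handled separately here.
classify_merge() {
  if git merge-base --is-ancestor "$2" "$1"; then
    echo "up to date"                 # B = R (also covers B = L = R)
  elif git merge-base --is-ancestor "$1" "$2"; then
    echo "fast-forward"               # B = L
  else
    echo "true merge"                 # B is some third commit
  fi
}

git commit -q --allow-empty -m "older"
git branch behind                     # stays at "older"
git commit -q --allow-empty -m "L"
git branch ahead
git branch diverged behind            # starts from "older"
git checkout -q ahead
git commit -q --allow-empty -m "newer"
git checkout -q diverged
git commit -q --allow-empty -m "side work"
git checkout -q mybranch

case_behind=$(classify_merge mybranch behind)
case_same=$(classify_merge mybranch mybranch)
case_ahead=$(classify_merge mybranch ahead)
case_diverged=$(classify_merge mybranch diverged)
printf '%s\n' "behind: $case_behind" "same: $case_same" \
  "ahead: $case_ahead" "diverged: $case_diverged"
```

Testing the "up to date" condition first mirrors the order described above, so the B = L = R case falls into the first branch.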

The --squash option ensures that the final commit M has only a single parent. (It does this by terminating the merge early, as if --no-commit had been specified, which is kind of stupid: you could just run git merge --no-commit --squash if you really wanted that, and it could just make the final commit as a non-merge commit when not using --no-commit, and then declare the merge done.) The --no-commit option ensures that Git doesn't make the final commit right away, which allows you to create an evil merge if you wish. (Note that for git merge, the short option -n means --no-stat, not --no-commit.)

How to see the merge base

If you're good at eyeballing Git graphs (see Pretty Git branch graphs) you may be able to spot the merge base just by looking, but a lot of graphs are horribly tangled. You can run git merge-base --all:

git fetch
git merge-base --all mybranch origin/branch-to-merge

The second command spits out hash IDs. Ideally, it spits out just one hash ID: that one hash ID is the merge base, and all goes simply from there.

If this prints two or more hash IDs, you have a complex merge. The default strategy (-s recursive or, in newer Git, -s ort) will merge the merge bases, creating a new but temporary commit from the result, and then use that commit as the merge base. This is simple in theory, but hard to comprehend and even harder to describe well. It's also pretty rare, so you probably won't encounter it.
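For the curious, a "criss-cross" history is one way to end up with two merge bases. The sketch below builds one on purpose in a temporary repository (empty commits, a local branch-to-merge standing in for origin/branch-to-merge, and made-up commit messages): each branch merges the other's earlier tip, so neither earlier tip is an ancestor of the other and both are "best" common ancestors.

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
repo=$(mktemp -d) && cd "$repo"
git init -q
git checkout -q -b mybranch

git commit -q --allow-empty -m "A"
git branch branch-to-merge            # stand-in for origin/branch-to-merge
git commit -q --allow-empty -m "X1"
git checkout -q branch-to-merge
git commit -q --allow-empty -m "Y1"

X1=$(git rev-parse mybranch)
Y1=$(git rev-parse branch-to-merge)

# Criss-cross: each branch merges the other's earlier tip.
git checkout -q mybranch
git merge -q -m "X2: merge Y1 into mybranch" "$Y1"
git checkout -q branch-to-merge
git merge -q -m "Y2: merge X1 into branch-to-merge" "$X1"
git checkout -q mybranch

# With --all, every best common ancestor is listed.
bases=$(git merge-base --all mybranch branch-to-merge | sort)
expected=$(printf '%s\n%s\n' "$X1" "$Y1" | sort)
echo "merge bases:"
echo "$bases"
```

Without --all, git merge-base prints only one of the two, which is why the --all form is the one to use when checking for this situation.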
