When I `git pull --rebase` and get a conflict, how do I `git show` the other person's commit?

So I do git pull --rebase, and there's a conflict. Ok, fine, let's resolve it.

I run git diff and the merge conflict diff is a bit confusing. I think I need to git show the two commits that are being merged, and see how they each independently attempted to modify the same piece of code.

So I run git status to see where we are and I get something like this:

$ git status
interactive rebase in progress; onto abc123
Last command done (1 command done):
   pick def456 Make some changes blah blah blah
No commands remaining.
You are currently rebasing branch 'main' on 'abc123'.
  (fix conflicts and then run "git rebase --continue")
  (use "git rebase --skip" to skip this patch)
  (use "git rebase --abort" to check out the original branch)

[file listing goes here]

The above mentions two commits:

abc123: I think this is the HEAD of the remote branch that I'm pulling into mine. So this is what I ultimately want to rebase onto, but I'm not up to this commit yet, so this doesn't help me.
def456: this is my commit, which is in conflict with... something else.

What's not listed there is the other half of the current merge conflict. I know that my commit is def456, but it doesn't tell me anything about the commit made by someone else, which my commit is in conflict with. This is very frustrating as I can't git show blah in order to see what the other author changed in isolation in order to understand the semantics of their change and then merge it with mine.

Currently I go and log into the web UI (e.g. github), look at the list of recent commits, and find the first one after the already-merged abc123, but there must be an easy way to get this from the CLI.

Is there?

CodePudding user response：

As Enrico Campidoglio noted, it's tricky to find the original source of your conflict. I think it's worth expanding on why, though. matt's comment mentioning the :2 and :3 syntax, for looking at the copies of files in Git's index, is fine and correct as far as it goes, but that may not be far enough:

Last command done (1 command done):
  pick def456 Make some changes blah blah blah
...
def456: this is my commit, which is in conflict with... something else

The something else is the commit Git is building on, which—depending on what's in the rebase—is quite often one of your own commits that's been copied so far. So the files here can be useful: that's where the immediate conflict is coming from. That might be enough. But where did the original conflict come from?

Rebase, part 1

Let's look at a typical rebase operation and give each commit a simple one-letter fake-hash-ID-name so that they're easy to talk about. We start with:

...--E--F   <-- main, origin/main
         \
          G--H--I   <-- my-new-feature (HEAD)

or—if you haven't been giving yourself good branch names to try to keep yourself-today from confusing yourself-tomorrow—perhaps:

...--E--F   <-- origin/main
         \
          G--H--I   <-- main (HEAD)

That is, you made three commits of your own, G, H, and I. These three commits extend out from the tip of a sequence of commits that was and/or still is found by some branch name in your repository: commit F specifically, which might be found only by a remote-tracking name like origin/main, or might be found by your branch name main if you were working in a branch named my-new-feature.

You ran git fetch and git rebase. You may have done this by entering git pull on the command line, which runs git fetch and then a second command, in which case, you told it to run rebase as its second command. Either way the git fetch obtained some new commits; I'll draw in two here, and update origin/main to point to the last of these two:

          J--K   <-- origin/main
         /
...--E--F
         \
          G--H--I   <-- my-new-feature (HEAD)

I stopped drawing in the name main at all, since we don't need it; if you were (and still are) calling your new branch main, that's the name that finds commit I, but either way we have one name that finds commit I and one that finds commit K, and those are the two names of interest.

Now we execute the second command that git pull would run, i.e., git rebase. The rebase command works by copying some existing commits, that are OK but not quite up to snuff for some reason, to new-and-improved commits. The improvement may involve:

changing the "base" commit (hence the name rebase); and/or
changing the stored snapshot in some way.

While the stored snapshot is the reason for Git's existence, it's the changes between pairs of stored snapshots that humans tend to care about. The rebase operation produces the new and improved commits such that the changes-between-pairs match the original changes-between-pairs:

(F, G) is a pair of commits on your branch, with some changes.
(G, H) is a pair of commits on your branch, with some changes.
(H, I) is a pair of commits on your branch, with some changes.

Git doesn't need to list the pairs directly: it just lists the second commit's hash ID. The first commit in each pair is automatically just the parent commit of that second commit, because these pairs are (necessarily) parent-child pairs.

So git rebase lists out the hash IDs of the three commits to copy: G, H, and I. Then git rebase:

enters detached HEAD mode:

          J--K   <-- origin/main, HEAD
         /
...--E--F
         \
          G--H--I   <-- my-new-feature

starts running git cherry-pick, one cherry (one commit) at a time:

git cherry-pick <hash-of-G>
git cherry-pick <hash-of-H>
git cherry-pick <hash-of-I>

Each of these cherry-pick operations is a merge, of sorts. So each one can fail with a merge conflict. The only one whose conflict has an obvious source, though, is the first one. Let's explore why.
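The detach-then-cherry-pick sequence can be tried by hand in a scratch repository. This is only a sketch under stated assumptions: the file names, commit messages, and the commit helper are invented for illustration, the local branch main plays the role of origin/main, and none of the picks conflict.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo"
git config user.email "demo@example.com"

# Hypothetical helper: write one file and commit it.
commit() { echo "$2" > "$1"; git add "$1"; git commit -q -m "$3"; }

commit base.txt base F
git checkout -q -b my-new-feature
commit g.txt g G
commit h.txt h H
commit i.txt i I
git checkout -q main
commit j.txt j J
commit k.txt k K          # main now stands in for origin/main

# What rebase does, by hand: detach HEAD at the target, then copy G, H, I.
git checkout -q --detach main
git cherry-pick main..my-new-feature >/dev/null
manual_tree=$(git rev-parse 'HEAD^{tree}')

# The real rebase, for comparison.
git checkout -q my-new-feature
git rebase -q main
rebased_tree=$(git rev-parse 'HEAD^{tree}')

git log --oneline --graph main my-new-feature
```

Both routes end with the same snapshot, which is the sense in which rebase is "just" a series of cherry-picks followed by moving the branch name.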

Cherry-pick vs merge

With a standard git merge, we start with two branches that share some commit in the past:

          o--o--o--L   <-- my-branch (HEAD)
         /
...--o--B
         \
          o--o--o--R   <-- their-branch

We're on branch my-branch, as indicated by HEAD being attached to my-branch. Commit L (the left-side or local or --ours commit) is the current commit, and it has some snapshot. The other --theirs commit, on the right side or remote, is commit R. The common starting point—the merge base—is commit B. Commits B and R also have some snapshot each.

Git will diff (as in git diff --find-renames) commits B and L to figure out what we changed: which files did we touch? What lines did we add and/or remove? Git will then diff B and R, to get the same kind of information about what they changed: which files did they touch and what lines did they add or remove?

Git can now combine these changes. When we touched a file and they didn't, that's easy: take our change. When they touched a file and we didn't, that's easy too: take their change. When we both touched the same file, that's harder, but Git can combine these changes, and can tell whether we touched the same lines of the file, because we both started from the same file in B, so if we touched line 17 of that file in B, and they only touched line 42, those changes don't collide with each other.

(Git is very line oriented here. Changes are on a line-by-line basis only, and are combined that way. This does not have to be how a merge engine works, but it is how Git's engines work. Technically, Git has a pluggable merge architecture, but all of the merge engines that you can plug in work this way.)

If we have a conflict, we can easily see exactly where it comes from. The merge base version of the file is in index slot 1, so git show :1:file.ext shows us that version of the file. Our version of the file is in index slot 2 (and also in the commit found by the name HEAD), so git show :2:file.ext or git show HEAD:file.ext shows us that version of the file. Their version of the file is in index slot 3 (and also in the commit found by the temporary name MERGE_HEAD), so git show :3:file.ext or git show MERGE_HEAD:file.ext shows us their version of the file.¹
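To make the three index slots concrete, here is a small sketch that forces a conflict in a throwaway repository and reads each slot back. The branch names follow the drawing above; the file contents and commit messages are made up for the demonstration.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/my-branch
git config user.name "Demo"
git config user.email "demo@example.com"

echo "base" > file.ext
git add file.ext
git commit -q -m "B: merge base"

git checkout -q -b their-branch
echo "theirs" > file.ext
git commit -q -am "R: their change"

git checkout -q my-branch
echo "ours" > file.ext
git commit -q -am "L: our change"

# Both sides changed the same line, so the merge stops with a conflict.
git merge their-branch >/dev/null 2>&1 || true

base_version=$(git show :1:file.ext)    # slot 1: merge base (B)
our_version=$(git show :2:file.ext)     # slot 2: --ours (L), same as HEAD:file.ext
their_version=$(git show :3:file.ext)   # slot 3: --theirs (R), same as MERGE_HEAD:file.ext

echo "base=$base_version ours=$our_version theirs=$their_version"
```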

Once the merge is all done, Git makes a merge commit—or has us make one, if we have to solve the conflicts first—and the new merge commit itself also has a (single) snapshot of all files. This is, by definition, the correct result of the merge. Where there were conflicts, Git made us solve them, and so our solutions are the correct ones. We can draw in our merge like this:

          o--o--o--L
         /          \
...--o--B            M   <-- my-branch (HEAD)
         \          /
          o--o--o--R   <-- their-branch

The only thing special about new merge commit M is that instead of just one parent, it has two: the first parent is L, the commit that was ours a moment ago, and the second parent is R. The diff from L to M is the resolved result of merging R, and the diff from R to M is the resolved result of merging L. The resolution here is inherent in the snapshot in commit M.

When we run git cherry-pick, though, the picture is different:

          o--...--P--C--...
         /
...--o--o--...--L   <-- my-branch (HEAD)

We run git cherry-pick C to copy the changes in commit C, the one we want copied. But what are the changes in commit C? Commits hold snapshots, not changes.

To find changes, Git has to find the parent P of child commit C. Git can then run a git diff on P vs C, to find changes. Git should include --find-renames too, in case commit C renamed a file. Git now has to combine those changes with ... what? Commit L, like C, just holds a snapshot.

Git could try to apply those changes as a patch, and early versions of Git did just that. In fact, up until recently, the standard git rebase ran git format-patch to turn commits into patches suitable for emailing, then ran git am on the resulting email files to apply them. This is the git-rebase--am back end for git rebase (selected with --apply in current Git), which still exists and can still be used today. It has some flaws: for instance, it doesn't detect renames, and it cannot copy an empty-change-set commit.² So it is, as of Git 2.26, no longer the default for rebase. In any case, it has not been used in git cherry-pick for a very long time.

What cherry-pick does instead, to detect conflicts reliably, is to treat commit P as the merge base for merging purposes. That is, the cherry-pick code runs a merge with P as a faked-up merge base. Commit L, the local or --ours commit, is --ours as always, so for any changes Git finds between a file as it exists in P and as it exists in L, Git will keep our "changes", i.e., keep the file the way it looks in L. Git will then add to those "changes" (that get us our L copy) any changes in the P-vs-C diff.

So, when we run git cherry-pick C—for any reason—the three commits involved are commit C itself of course as --theirs, C's parent P as the merge base, and our own current commit as --ours as usual. If we get conflicts, we presumably know all about our file—it's just our file after all—and we can look at P and C and the changes between them to see what they changed and why.
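The same slot inspection works for a stopped cherry-pick, with CHERRY_PICK_HEAD naming commit C. The following is a sketch in a scratch repository (all contents and messages are invented); it checks that slot 1 really holds P's copy of the file rather than any true common ancestor.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/my-branch
git config user.name "Demo"
git config user.email "demo@example.com"

echo "original" > file.ext
git add file.ext
git commit -q -m "root"

git checkout -q -b other
echo "parent" > file.ext
git commit -q -am "P: parent of the commit we want"
echo "child" > file.ext
git commit -q -am "C: the commit we want"

git checkout -q my-branch
echo "local" > file.ext
git commit -q -am "L: our current commit"

# Copy C onto L; the pick stops because both sides changed the same line.
git cherry-pick other >/dev/null 2>&1 || true

# The change being copied is the P-vs-C diff.
git diff CHERRY_PICK_HEAD~1 CHERRY_PICK_HEAD -- file.ext

stage1=$(git show :1:file.ext)   # faked-up merge base: P
stage2=$(git show :2:file.ext)   # --ours: L
stage3=$(git show :3:file.ext)   # --theirs: C
echo "base=$stage1 ours=$stage2 theirs=$stage3"
```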

¹There's some annoying squirrelliness (squirrelly-ness? the built in spelling check does not like even plain squirrelly, though it is in fact a word) around renames here. The file named file.ext might have a different name in one or two or even all three commits, yet still be the same file.

²The git am or git apply back-end could theoretically miss a conflict in some cases where git rebase could theoretically detect one. Whether this theoretical miss ever has any practical difference, I'm not sure. Because Git is pretty consistent inside its diff and merge engines, I'm not sure there is a way to trigger this. When git am or git apply does detect a conflict and does the "falling back" to 3-way merge trick, the result ends up being the same as when using cherry-pick. Besides this, git apply was recently taught to notice the index lines and use the 3-way merge earlier, so this may erase any theoretical difference as well. This, however, will depend on your specific Git version, since Git versions predating, say, 2.20 don't use index lines aggressively.

Rebase, part 2

All that stuff about how to handle the cherry-pick operation is fine when we're the ones running git cherry-pick, but during git rebase, Git is the one running cherry-pick, over and over again:

          J--K   <-- origin/main, HEAD
         /
...--E--F
         \
          G--H--I   <-- my-new-feature

Git runs git cherry-pick hash-of-G to copy commit G to a new and improved commit, which we'll call G'.

Knowing how cherry-pick works, and that G is our first commit on our new feature, we can just look at files from commit K when there's a conflict: that's their file. We know that "their" change—parent-vs-child—is really our change in commit G. We can use this knowledge to solve any conflicts, and allow Git to make the new copy G':

               G'  <-- HEAD
              /
          J--K   <-- origin/main
         /
...--E--F
         \
          G--H--I   <-- my-new-feature

But now, Git runs git cherry-pick hash-of-H to copy commit H to a new and improved commit, which we'll call H'. The result will—if all goes well—look like this:

               G'-H'  <-- HEAD
              /
          J--K   <-- origin/main
         /
...--E--F
         \
          G--H--I   <-- my-new-feature

At this point Git will run git cherry-pick hash-of-I to copy commit I to a new and improved I'. If this has a conflict, we're still in the state we drew above. The --ours commit is commit H': our copy of H. The --theirs commit is commit I: our change from H to I. The conflict is in some file in the --ours commit, H', vs some file in the --theirs commit, I, as compared to some file in commit H. All three commits are ours!

The problem here is that the conflicting part in our commit H', as diffed against merge base commit H, actually came from commit J or K. This is the key realization. Without it, Enrico Campidoglio's answer doesn't make sense. Once we do realize it, the answer suddenly makes sense. If some line in H-vs-H' conflicts with our own H-vs-I change, we must find the right line(s) in commits J and/or K, to understand why they made their change.
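A shortened version of this situation (two of our commits instead of three, one upstream commit) can be reproduced as a sketch. It assumes a reasonably recent Git, where a stopped rebase records the commit being copied in REBASE_HEAD; the branch name upstream stands in for origin/main, and all file contents are invented.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo"
git config user.email "demo@example.com"

printf 'alpha\nbeta\n' > file.txt
git add file.txt
git commit -q -m "F: shared base"

# Upstream work (J); this local branch stands in for origin/main.
git checkout -q -b upstream
printf 'alpha\nbeta from J\n' > file.txt
git commit -q -am "J: edit beta"

# Our work: G leaves file.txt alone, H edits the same line as J.
git checkout -q -b my-new-feature main
echo g > g.txt
git add g.txt
git commit -q -m "G: unrelated file"
printf 'alpha\nbeta from H\n' > file.txt
git commit -q -am "H: edit beta"

# G is copied cleanly to G'; copying H stops with a conflict.
git rebase upstream >/dev/null 2>&1 || true

ours_subject=$(git log -1 --format=%s HEAD)            # G', our own copy
theirs_subject=$(git log -1 --format=%s REBASE_HEAD)   # H, the commit being copied
base_subject=$(git log -1 --format=%s REBASE_HEAD~1)   # G, the faked-up merge base
ours_line=$(git show :2:file.txt | sed -n 2p)          # text that really came from J

echo "ours:   $ours_subject"
echo "theirs: $theirs_subject"
echo "base:   $base_subject"
echo "conflicting line on the ours side: $ours_line"
```

All three commit subjects are ours, yet the conflicting line on the --ours side was written in J, which is why none of the names shown by git status point at the other author's commit.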

One problem with just using the line numbers from H' is that they may not match the line numbers in J and/or K. What we really need is the line numbers as they existed—and still exist—in J and/or K before Git imported our commits G and H.

We want to look at the commits that are in the --onto target (here, origin/main or commit K) but not in the history our branch originally forked from. That fork point is commit F, which may not have a name. Git's reflogs can be helpful: if the fetch that brought in J and K was the most recent update to origin/main, then origin/main@{1} holds the hash ID of commit F. Or, if we are using our own branch name, and have a branch name main around, pointing to commit F, we can use that. Running:

git log origin/main@{1}..origin/main

or:

git log main..origin/main

will show us the log messages for J and K; adding -p will show those as patches, though because git log is unhelpful with merge commits we might want git log -m -p or git log --cc -p if there are potential merge commits in that range.
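It often helps to narrow that log to the files that are actually conflicted. The sketch below does this in a scratch repository; note the assumptions: there is no real remote, so a hand-made refs/remotes/origin/main stands in for the result of git fetch, main is left pointing at F, and the file names are invented (paths containing spaces would need more careful quoting).

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo"
git config user.email "demo@example.com"

printf 'alpha\nbeta\n' > file.txt
git add file.txt
git commit -q -m "F: shared base"

# Their commits J and K; the update-ref fakes what a fetch would record.
git checkout -q -b theirs
printf 'alpha\nbeta from J\n' > file.txt
git commit -q -am "J: edit beta"
echo k > k.txt
git add k.txt
git commit -q -m "K: unrelated file"
git update-ref refs/remotes/origin/main theirs

# Our commits G and H on a feature branch forked from F.
git checkout -q -b my-new-feature main
echo g > g.txt
git add g.txt
git commit -q -m "G: unrelated file"
printf 'alpha\nbeta from H\n' > file.txt
git commit -q -am "H: edit beta"

git rebase origin/main >/dev/null 2>&1 || true   # stops while copying H

conflicted=$(git diff --name-only --diff-filter=U)
# main still points at F, so main..origin/main is exactly J and K.
suspects=$(git log --format=%s main..origin/main -- $conflicted)
echo "conflicted: $conflicted"
echo "upstream commits touching it: $suspects"

# Same range, shown as patches.
git log -p main..origin/main -- $conflicted
```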

Once we do resolve the conflict and continue, rebase will finish with the last copy, and then yank the branch name over:

               G'-H'-I'  <-- my-new-feature (HEAD)
              /
          J--K   <-- origin/main
         /
...--E--F
         \
          G--H--I   ???

All of our conflict resolutions are now recorded in our three copies, just as when we did a merge and recorded the conflict resolution in the snapshot in merge commit M.

Conclusion

There is no perfect answer here. The conflicts are often in commits that are all, in some sense, "our" commits. They are occurring because of commits that come before these commits, in which case we might find them in those earlier commits; but if we had to resolve some conflicts during one of our earlier cherry-picks, they may even be sui generis, arising from our own earlier resolution.

CodePudding user response：

[...] it doesn't tell me anything about the commit made by someone else, which my commit is in conflict with.

This is trivial to do during a merge operation, thanks to the --merge option of git log:

--merge
After a failed merge, show refs that touch files having a conflict and don’t exist on all heads to merge.

Unfortunately, it doesn't work during a rebase due to the lack of MERGE_HEAD (a reference to the commit being merged into the one referenced by HEAD).

The good news is that you can work around this limitation by manually looking for commits that touch the conflicting lines. It's just a bit more involved.

First off, get the paths of the conflicting files:

# 'U' means unmerged
git diff --diff-filter=U --name-only

Then, identify the range of lines from the "other side" of the conflict. During a rebase the roles are swapped: "ours" (HEAD) is the history you are rebasing onto rather than your own commit, so you'll be looking at the HEAD section:

1 │ <<<<<<< HEAD
2 │ Here's some content
3 │ =======
4 │ Here's some different content
5 │ >>>>>>> def456 (Your commit)

With these two pieces of information, you can now get a list of commits that touched the conflicting lines starting from the commit you're rebasing onto (in your example, abc123) using the -L option of git log:

git log --oneline -L1,2:path/to/conflicting/file abc123

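Note that the line numbers given to -L refer to the file as it exists in abc123, which contains no conflict markers, so you'll have to subtract the marker lines that precede the range when reading the numbers off the conflicted file.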
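Here is the whole procedure as a sketch in a scratch repository. The branch name upstream plays the role of abc123, and the file contents are invented; the line range is worked out by hand in the comment rather than computed, and it assumes no earlier pick in the rebase has shifted lines in the file.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo"
git config user.email "demo@example.com"

printf 'alpha\nbeta\n' > file.txt
git add file.txt
git commit -q -m "F: shared base"

git checkout -q -b upstream            # stands in for abc123
printf 'alpha\nbeta from J\n' > file.txt
git commit -q -am "J: edit beta"

git checkout -q -b my-new-feature main
printf 'alpha\nbeta from H\n' > file.txt
git commit -q -am "H: edit beta"

git rebase upstream >/dev/null 2>&1 || true

# Step 1: which files are unmerged?
conflicted=$(git diff --name-only --diff-filter=U)

# Step 2: in the conflicted working file, "beta from J" is on line 3, right
# after the "<<<<<<< HEAD" marker on line 2. The committed file has no
# marker, so the same text is line 2 there.
culprit=$(git log --format=%s -L2,2:"$conflicted" upstream | head -n 1)
echo "last upstream commit to touch that line: $culprit"
```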

CodePudding user response：

In this scenario you probably don't care about what changed on the target branch from a specific commit standpoint. Instead, what matters is the changes (per file) on the target branch since your branch diverged. So the diff you're asking about ("how they each independently attempted to modify the same piece of code") is between the merge-base of your branch and the target branch, and the tip of the target branch which you are rebasing onto. For example, if you have branch abc123 checked out and rebase it onto main, you would want to find the merge-base of these 2 branches:

git merge-base main abc123

and then compare that commit with main to see the total changes between those 2 commits. You can drill down to a per file basis to see the exact changes that happened on the target branch when determining how to resolve conflicts with your current commit during the rebase. (Note the order of branches doesn't matter with the merge-base command, in case you're rebasing the other way.)
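As a sketch of that comparison (in a scratch repository, with my-branch as a made-up name for the branch being rebased and invented file contents), the merge base can be fed straight into git diff. The three-dot form of git diff is a shorthand for the same thing.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Demo"
git config user.email "demo@example.com"

printf 'alpha\nbeta\n' > file.txt
git add file.txt
git commit -q -m "F: shared base"

git checkout -q -b my-branch
printf 'alpha\nbeta from H\n' > file.txt
git commit -q -am "H: edit beta"

git checkout -q main
printf 'alpha\nbeta from J\n' > file.txt
git commit -q -am "J: edit beta"
echo k > k.txt
git add k.txt
git commit -q -m "K: unrelated file"

base=$(git merge-base main my-branch)

# Everything the target branch changed since the branches diverged...
changed=$(git diff --name-only "$base" main)
echo "$changed"

# ...and the exact change to one file of interest.
git diff "$base" main -- file.txt

# "A...B" diffs merge-base(A, B) against B, so these two are the same.
two_step=$(git diff "$base" main)
three_dot=$(git diff my-branch...main)
```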

Tip: If I'm planning to eventually squash my branch into fewer commits, I usually find it better to do so before I perform a branch rebase onto the target branch. This way I'll have fewer total commits that can possibly conflict when I do rebase onto the target. (Otherwise you sometimes have to step through multiple conflicts and the rebase takes longer.)

Side note: Your question contains this text:

rebasing branch 'main' on 'abc123'.

which feels backwards to me because typically you don't rewrite commits on branches such as main. Not sure if that was intentional or not.