Diff HEAD and merge-branch against parent for merge conflicts

Time:02-28

For better or worse, my Git workflow regularly produces fairly large merge conflicts. I find the best way to resolve these is typically to get the diffs of a) the HEAD branch against the parent branch, and b) the merge-branch against the parent branch, in order to manually inspect the changes and incorporate bits of both. However, Git and Visual Studio Code do not make this easy at all. Is there any solution, ideally specifically for VS Code, that allows me to inspect these two diffs for each merge conflict?

CodePudding user response:

git mergetool is what Git provides as a building block here. It is by no means perfect, but it provides the three key inputs (base, ours, and theirs). The merge tool that git mergetool runs, which you may write yourself, receives as its arguments your own expansion(s) of these variables, each of which holds a file name:

$BASE
$LOCAL
$REMOTE
$MERGED
$BACKUP

Not all files will always be present. In particular, $BACKUP is omitted if you tell git mergetool that it can trust the merge tool's exit status. Tools should generally ignore the $BACKUP variable, but see below.

Your tool's job is to produce, in $MERGED, the correct resolved file data. You may read the text of the merge base input from $BASE. The HEAD or --ours version of the file is in $LOCAL, and the other or --theirs version of the file is in $REMOTE. Your tool can use any code you can think of and write, including running diff commands from $BASE to $LOCAL and/or from $BASE to $REMOTE, if you wish.
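As one hedged illustration of the diff idea, the sketch below prints the two diffs the question asks about: merge base to ours, and merge base to theirs. The function name show_side_diffs is made up for this example. It assumes a POSIX shell and a diff that supports -u, and it only displays the diffs; it does not write anything to $MERGED.

```shell
#!/bin/sh
# show_side_diffs BASE LOCAL REMOTE
# Hypothetical helper (not part of Git): prints what each side changed
# relative to the merge base.
show_side_diffs() {
    base=$1 local_file=$2 remote=$3

    echo "=== base -> ours (LOCAL) ==="
    # diff exits with status 1 when the files differ; that is expected here.
    diff -u "$base" "$local_file" || true

    echo "=== base -> theirs (REMOTE) ==="
    diff -u "$base" "$remote" || true
}
```

A merge-tool script could call show_side_diffs "$BASE" "$LOCAL" "$REMOTE" and then open "$MERGED" in an editor so that you can write the resolved contents there.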

How this all works

Note that git mergetool operates on one file at a time, and only after git merge has already made its best attempt at doing the merge on its own. This attempt has failed (otherwise you would not be running git mergetool).

At git merge time, Git read (past tense) three sets of files from three commits:

  • a merge base commit, containing whatever files are in that snapshot;
  • an "ours" commit (HEAD), containing whatever files are in that snapshot (in some cases, such as with git stash apply, the files in the working tree are copied into the index and used as the "ours" version instead); and
  • a "theirs" commit (the other commit, --theirs or "remote" here).

Let's assume for illustration that in the merge base commit we had a file foo.txt, in the --ours commit we had foo.txt, and in the other commit we had foo.txt, and there was a merge conflict at lines 100–120 or so, plus non-conflicting changes on earlier and later lines.

Because Git did its best to combine our changes with their changes, there is a foo.txt file in the working tree right now that contains Git's attempt. This has Git's decision as to what the right result was for the earlier and later lines, and, in place of the original lines 100–120 or so, conflict markers and the "ours" and "theirs" sections of the corresponding files (plus the corresponding "base" section if using diff3 or the new zdiff3 format).

Because Git failed to fully resolve the file, though, Git's index now contains all three input files: :1:foo.txt has the base version, :2:foo.txt has the --ours version, and :3:foo.txt has the --theirs version in it. The git mergetool command copies these three versions of foo.txt out of the index—the index copies are in Git's internal read-only compressed-and-de-duplicated format, unreadable by anything other than Git itself—into ordinary files with peculiar temp-file names.
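Because the three inputs sit in the index, both diffs can also be produced without any merge tool. The sketch below relies on Git's :N:path stage syntax and on git diff accepting two blobs. The function name show_stage_diffs is hypothetical, and the sketch assumes the path has all three stages, which is not the case for every kind of conflict (an add/add conflict has no base version, for instance).

```shell
#!/bin/sh
# show_stage_diffs PATH
# Hypothetical helper: diffs the index stages of one unresolved file.
# PATH is interpreted relative to the top of the repository.
show_stage_diffs() {
    path=$1

    echo "=== base (:1) -> ours (:2): $path ==="
    git --no-pager diff ":1:$path" ":2:$path"

    echo "=== base (:1) -> theirs (:3): $path ==="
    git --no-pager diff ":1:$path" ":3:$path"
}
```

Running git ls-files -u lists the unresolved paths together with their stage numbers, so you can see which stages exist before calling show_stage_diffs foo.txt.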

Now that these four versions of the file exist (base, local, "remote"/theirs, and merged), the git mergetool command sets up four shell variables, $BASE, $LOCAL, $REMOTE, and $MERGED, to hold the four file names that hold these contents. Since $MERGED is the simple file name (foo.txt), it doesn't have any funny dots or pids or /tmp or whatever squeezed into its name; even so, your program should just use $MERGED to find the file's name. Again, this is where your program, your merge tool, should write its final result.

If you don't tell Git to trust your command's exit status, Git will copy the initial (pre-tool-run) copy of $MERGED to a temporary file named $BACKUP. After your tool exits, regardless of its exit code, Git will compare the contents in $BACKUP against the contents in $MERGED. If the two files match, Git will assume the tool failed and the file is not resolved yet. If the two files differ, Git will assume the tool succeeded and the file is resolved correctly.

If you do tell Git to trust your command's exit status, Git doesn't bother with this, and just looks for zero (succeeded) or nonzero (failed) as the exit status.
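The two cases can be summarized in a few lines of shell. This is a simplified model of the rule as described above, not Git's actual implementation: the function name tool_succeeded and its arguments are invented for illustration, and Git's own script may use a different check for "unchanged" (for example, file modification times) and may prompt you before deciding.

```shell
#!/bin/sh
# tool_succeeded TRUST STATUS BACKUP MERGED
# Hypothetical model: returns 0 if the file should be treated as resolved.
#   TRUST  - "true" if the tool's exit status is trusted
#   STATUS - the exit status the tool returned
tool_succeeded() {
    trust=$1 status=$2 backup=$3 merged=$4

    if [ "$trust" = true ]; then
        # Trusted tool: zero means success, nonzero means failure.
        [ "$status" -eq 0 ]
    elif cmp -s "$backup" "$merged"; then
        # Untrusted tool left the file unchanged: treat as not resolved.
        return 1
    else
        # Untrusted tool changed the file: treat as resolved.
        return 0
    fi
}
```

The setting that tells Git to trust a tool's exit status is mergetool.<tool>.trustExitCode.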

In either case, having run your command to completion, Git now uses the success/failure indicator to decide whether to run git add on $MERGED. If Git believes the tool has succeeded, git mergetool will run the git add command, which will mark the file resolved (knocking out the three nonzero-stage index entries and putting in a single stage-zero entry). If Git believes the tool has failed, git mergetool does nothing, leaving the three nonzero-stage index entries that mark the file as unresolved, enabling later resolution and keeping the three inputs available.

The mergetool code then cleans up its temporary files, and goes on to the next unresolved file. It simply iterates through all unresolved files, one at a time, running your chosen merge tool as described above. Or, you can run git mergetool f1.ext f2.ext to limit it to resolving just those two files, for instance.

Since you can provide a shell script, binary executable, or whatever you like as the merge tool, you can get it to behave any way you like: it's just a Small Matter of Programming. The painful part here is that it's always run one file at a time, so it never gets a "global view" of the diffs across all files. (Also, for tree-level conflicts, such as rename/rename or modify/delete, git mergetool falls back on rather primitive prompting.)
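To make this concrete, here is a hedged sketch of how a script of your own could be registered as a merge tool. The configuration keys mergetool.<tool>.cmd, mergetool.<tool>.trustExitCode, and merge.tool are Git's; the function name register_diff_tool, the tool name, and the script path are placeholders for this example. The settings are written to the current repository's configuration only.

```shell
#!/bin/sh
# register_diff_tool TOOL_NAME SCRIPT_PATH
# Hypothetical helper: registers SCRIPT_PATH as a custom merge tool.
register_diff_tool() {
    name=$1 script=$2

    # Git runs this string through the shell with BASE, LOCAL, REMOTE and
    # MERGED set, so the variables are stored unexpanded (note the \$).
    git config "mergetool.$name.cmd" \
        "\"$script\" \"\$BASE\" \"\$LOCAL\" \"\$REMOTE\" \"\$MERGED\""

    # Decide success from the script's exit status (no backup comparison).
    git config "mergetool.$name.trustExitCode" true

    # Use this tool whenever git mergetool is run without --tool.
    git config merge.tool "$name"
}
```

After something like register_diff_tool twodiff /path/to/your-script, running git mergetool f1.ext would invoke the script for that one file.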

CodePudding user response:

However, Git ... does not at all make this easy

I can't agree with that. If you've set up your Git (through git config) with merge.conflictStyle set to diff3, conflicts are written with three pieces of information: the current branch (HEAD); the incoming branch; and the common parent (the merge-base commit). I find that's exactly the information needed.
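A minimal sketch of that setup follows. The function name enable_diff3_markers is invented for this example; it sets the option for the current repository only (adding --global to the git config call would apply it everywhere) and, if you pass paths, rewrites those still-unresolved files with diff3-style markers.

```shell
#!/bin/sh
# enable_diff3_markers [PATH...]
# Hypothetical helper: turns on diff3 conflict markers.
enable_diff3_markers() {
    # Future conflicts in this repository will include the base section.
    git config merge.conflictStyle diff3

    # Optionally re-create the markers for already-conflicted paths.
    # Warning: this overwrites any edits made to those working-tree files.
    if [ "$#" -gt 0 ]; then
        git checkout --conflict=diff3 -- "$@"
    fi
}
```

With this style, each conflict gains a middle section, introduced by a line of ||||||| characters, that shows the merge-base text between the "ours" and "theirs" sections.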

Also there are other GUIs that always give you this triple information (but that would take us outside the scope of the question).
