Git merge conflicts when only one person is changing the file

Time:11-09

I was working on a feature and merged all changes to a specific hotfix branch provided by our test infra team. The branch was merged into our release branch with no conflicts. Later, when the test infra team was trying to merge the release branch back to other hotfix branches, they saw lots of merge conflicts.

My question is: what caused those conflicts if I am the only one who was working on those files?

Moreover, when I tried to merge the release branch and hotfix branch locally, I also saw lots of conflicts caused by others' code changes. So is manually cherry-picking all my commits to the hotfix branch the only way to fix this issue?

Appreciate your help!

CodePudding user response:

Merge is about combining work. It doesn't matter who did the work. What matters is the work, as recorded in commits.

Commits themselves have structure. Each commit stores two things:

  • a full snapshot of every file;

  • metadata: information about who made the commit, when (date and time stamps), why (log message), and so forth.

Every commit is numbered, using a unique but unpredictable and random-looking hash ID. In the metadata for any given commit, Git includes the raw commit hash ID of the previous commit(s). This forms most commits into simple, backwards-looking chains:

... <-F <-G <-H

Here, H stands in for the hash ID of the last commit in the chain. Commit H contains the raw hash ID of earlier commit G. Remember that the hash IDs are big, ugly, and unpredictable, so there is no other way to tell that, e.g., c6fc44e9bf85dc02f6d33b11d9b5d1e10711d125 comes before or after e9e5ba39a78c8f5057262d49e261b42a8660d5b9 (if H is one and G is the other). There's no simple less-than or greater-than operation you can do here, other than have Git read through the metadata: if H points to G, then G comes right before H. Similarly, G points backwards to F; so F precedes G and G precedes H.
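To make the backwards-pointing chain concrete, here is a minimal Python sketch. It is only a toy model, not how Git stores anything: the `parents` table and the `history` function are names I made up, single letters stand in for real hash IDs, and I pretend F is the very first commit so that the walk has somewhere to stop.

```python
# Hypothetical toy model: commit ID -> list of parent commit IDs.
# Real Git uses long hash IDs; single letters stand in for them here.
parents = {
    "F": [],      # assumption for the demo: F has no parent
    "G": ["F"],   # G points backwards to F
    "H": ["G"],   # H points backwards to G
}

def history(tip):
    """Walk backwards from tip, following first parents, newest first."""
    chain = []
    current = tip
    while current is not None:
        chain.append(current)
        ps = parents[current]
        current = ps[0] if ps else None
    return chain

print(history("H"))  # ['H', 'G', 'F']
```

Note that the only way to learn that G precedes H is to read H's entry and follow the pointer, which is the point made above about hash IDs having no useful ordering.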

The graph we can build from this "precedes" notion, in Git, is a directed acyclic graph or DAG. The DAG provides what Git needs to perform git merge operations. Suppose that, starting from a repository where some main branch ends at commit H like this:

...--G--H   <-- main

we add two branches br1 and br2, and on these two branches, we place two new commits each:

          I--J   <-- br1
         /
...--G--H
         \
          K--L   <-- br2

(the name main still exists, and still points to commit H; I've just removed it from the drawing to de-clutter it). Note that commits up to and including H are on all branches.

If we now git checkout br1 and then run git merge br2, Git:

  • starts out on commit J, as that's the one we had Git copy out of the repository and are now using via name br1;
  • locates commit L, because br2 points to L;
  • finds the merge base commit.

The merge base of a pair of branch-tip commits is the best shared (common) commit. That is, Git looks at commit J and asks if it's on both branches. It isn't, so Git has to go back one to I. That's not on both branches either, so Git has to go back again to H. Or, equivalently, Git can start from L and go back twice to get to H. Having gotten to H, though, the answer is now: yes, this commit is on both branches. So it is a viable merge base. It's also the best one, for reasons we won't go into here, so it is the merge base commit.
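The search for a merge base can be sketched in a few lines of Python. This is a simplified illustration under my own assumptions, not Git's actual algorithm: the `parents` table models the drawing above (with G treated as the first commit), and `ancestors` and `merge_base` are hypothetical helper names. "Best" here means a shared commit that is not an ancestor of some other shared commit.

```python
# Toy commit graph from the drawing: commit ID -> parent commit IDs.
parents = {
    "G": [],              # assumption: treat G as the root for this demo
    "H": ["G"],
    "I": ["H"], "J": ["I"],   # br1
    "K": ["H"], "L": ["K"],   # br2
}

def ancestors(tip):
    """Return every commit reachable from tip, including tip itself."""
    seen = set()
    stack = [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents[commit])
    return seen

def merge_base(a, b):
    """Shared commits that are not strict ancestors of another shared commit."""
    common = ancestors(a) & ancestors(b)
    best = [
        c for c in common
        if not any(c in ancestors(other) - {other} for other in common)
    ]
    return sorted(best)

print(merge_base("J", "L"))  # ['H']
```

Both G and H are on both branches, but G is an ancestor of H, so H wins. The function returns a list because, in more tangled graphs, there can be more than one best candidate.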

The merge operation therefore involves commits H, J, and L. No other commits are needed at this point. Git now runs two git diff operations:

  • Git compares the snapshot in H to that in J. Whatever is different is stuff "we" changed, on branch br1.

  • Git compares the snapshot in H to that in L. Whatever is different is the stuff "they" changed, on branch br2.

It doesn't matter who "we" and "they" are, in terms of who made which commit. What matters are the snapshots in these three commits, and the result of these two comparisons.

The output of each git diff run here is:

  • a list of changed files, along with
  • for each file, the changes that, if applied to the H version of the file, produce the branch-tip version of that same file.
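As a rough illustration of those two outputs, the following Python sketch treats a snapshot as a plain dictionary from file name to file content. The file names, contents, and the helpers `changed_files` and `file_diff` are all invented for the example; Git's own diff machinery is far more capable. Only the standard-library `difflib` module is used.

```python
import difflib

# Hypothetical snapshots: file name -> full file content.
base = {"README": "hello\n", "app.py": "x = 1\ny = 2\n"}
tip = {"README": "hello\n", "app.py": "x = 1\ny = 3\n", "new.txt": "added\n"}

def changed_files(old, new):
    """Names of files added, deleted, or modified between two snapshots."""
    return sorted(
        name for name in old.keys() | new.keys()
        if old.get(name) != new.get(name)
    )

def file_diff(old, new, name):
    """Unified diff that turns the old version of one file into the new one."""
    a = old.get(name, "").splitlines(keepends=True)
    b = new.get(name, "").splitlines(keepends=True)
    return "".join(
        difflib.unified_diff(a, b, fromfile="base/" + name, tofile="tip/" + name)
    )

print(changed_files(base, tip))  # ['app.py', 'new.txt']
print(file_diff(base, tip, "app.py"))
```

Nothing in either output mentions an author. The comparison sees only the two snapshots.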

What Git can now do is to combine the two sets of changes. This keeps our changes (H-vs-J) and adds their changes (H-vs-L), or, equivalently, keeps their changes and adds ours. There's no real difference between these two operations as long as there are no conflicts. Git applies the added-together changes to the files from H, and the result is a new snapshot, which goes into a new merge commit M:

          I--J
         /    \
...--G--H      M   <-- br1 (HEAD)
         \    /
          K--L   <-- br2

Merge commit M causes the name br1 to advance, because that's the current branch, and adding a new commit to the current branch moves the branch name to point to the new commit. Merge commit M has an ordinary snapshot, just like any ordinary non-merge commit. What makes merge commit M a merge commit is that it points back to commit J (as usual) but also to commit L. That is, M has two parents, instead of the usual one.
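Here is a small Python sketch of the combining step for the conflict-free case. It is deliberately cruder than Git: it merges whole files rather than lines within files, and the snapshots, file names, and the `merge_snapshots` helper are assumptions of mine. The rule per file is: if only one side differs from the merge base, take that side.

```python
def merge_snapshots(base, ours, theirs):
    """File-level three-way merge. Returns (merged_snapshot, conflicting_names)."""
    merged, conflicts = {}, []
    for name in sorted(base.keys() | ours.keys() | theirs.keys()):
        b, o, t = base.get(name), ours.get(name), theirs.get(name)
        if o == t:        # both sides agree, or neither side changed the file
            result = o
        elif o == b:      # only they changed it: take their version
            result = t
        elif t == b:      # only we changed it: keep our version
            result = o
        else:             # both changed it, differently
            conflicts.append(name)
            continue
        if result is not None:   # None means the file was deleted
            merged[name] = result
    return merged, conflicts

# Hypothetical snapshots for commits H (merge base), J (br1) and L (br2).
H = {"a.txt": "1\n", "b.txt": "2\n"}
J = {"a.txt": "1 changed on br1\n", "b.txt": "2\n"}
L = {"a.txt": "1\n", "b.txt": "2 changed on br2\n"}

snapshot, conflicts = merge_snapshots(H, J, L)
M = {"snapshot": snapshot, "parents": ["J", "L"]}  # two parents make it a merge

print(conflicts)     # []
print(M["parents"])  # ['J', 'L']
```

Swapping `J` and `L` in the call gives the same snapshot, which matches the remark that "ours plus theirs" and "theirs plus ours" are equivalent when nothing conflicts.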

If there are conflicts, Git's merge:

  • leaves a mess in Git's index, which we must clean up;
  • leaves the conflicts in the working tree version of the file, which we can use to clean up Git's index;

and stops with an error. We clean things up, fix up Git's index with git add, and use git merge --continue or git commit to tell Git that we cleaned everything up and it should go ahead and make merge commit M now.

Who introduced the conflicting changes is not relevant. The conflict must come from (H-vs-J)-vs-(H-vs-L). The authors and committers of those three commits are irrelevant. What matters are the snapshots in those commits, which generated those diffs, which conflicted. You will be the author and committer of the merge commit. The existing commits cannot and will not be changed. Your responsibility, at this point, is just to make the new merge commit.
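To see that a single person can produce a conflict alone, consider this Python sketch. It is a simplified stand-in for Git's line-level merge: it assumes all three versions of the file have the same number of lines and compares them position by position. The file contents and the `merge_lines` helper are hypothetical, and the markers only imitate the ones Git writes into the working tree.

```python
def merge_lines(base, ours, theirs):
    """Position-wise three-way merge of equal-length line lists.

    Returns (merged_lines, conflicted).
    """
    if not (len(base) == len(ours) == len(theirs)):
        raise ValueError("this toy merge only handles equal-length files")
    out, conflicted = [], False
    for b, o, t in zip(base, ours, theirs):
        if o == t:        # same on both sides
            out.append(o)
        elif o == b:      # only their side changed this line
            out.append(t)
        elif t == b:      # only our side changed this line
            out.append(o)
        else:             # both sides changed it, differently
            conflicted = True
            out += ["<<<<<<< ours", o, "=======", t, ">>>>>>> theirs"]
    return out, conflicted

# All three versions were written by the same person.
base   = ["timeout = 30", "retries = 3"]   # merge base
ours   = ["timeout = 60", "retries = 3"]   # their commit on one branch
theirs = ["timeout = 45", "retries = 5"]   # their commit on another branch

merged, conflicted = merge_lines(base, ours, theirs)
print(conflicted)  # True
print("\n".join(merged))
```

The `retries` line merges cleanly because only one side touched it. The `timeout` line conflicts because both tips differ from the merge base in different ways, and the function never looks at who made any of the commits.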
