What happens if I use git pull (merge) and squash my commit into master?

I'm unsure if this strategy is causing us more issues than it's worth. Let's use this as an example (pulled from git docs).


    A---B---C---D---E origin/master
        \     
         X---Y--- feature/branch

I've created a feature/branch of origin/master at commit B. I've committed X and Y locally. C, D, and E have been committed to origin by other developers. I then use git pull origin master to pull down latest changes from master into my feature/branch.

    A---B---C---D---E origin/master
        \            \
         X---Y--------M feature/branch

This appears to take C, D, and E and create a merge commit M in my feature branch. I then squash my feature branch, creating commit Z, and push it to origin.

    A---B---C---D---E       origin/master
        \            \     /
         X---Y--------M---Z feature/branch

It's possible I'm misunderstanding the pull merge.

Does origin now contain A,B,C twice?

Should we use rebase instead?

Any suggestions are greatly appreciated.

CodePudding user response：

Note: For the purposes of this answer, let's assume you have a local branch named master which is equivalent to origin/master, and another local branch named feature which is equivalent to feature/branch.

I believe your question is arising from the fact that your third graph is incorrect. Given your second graph as is:

    A---B---C---D---E (master)
        \            \
         X---Y--------M (feature)

If you then wish to squash feature onto master the resulting graph would be:

    A---B---C---D---E---Z (master)

In this case commit Z would contain all of the changes from both commits X and Y, and also from M if there were any changes in the merge commit.
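To see this for yourself, here is a minimal sketch that rebuilds the second graph in a throwaway repository and then squash merges feature onto master. The file names, the commit helper, and the committer identity are made up for illustration, and I'm assuming a local merge stands in for what your git pull origin master did.

```shell
#!/bin/sh
# Sketch only: throwaway repo; file names, messages and identity are illustrative.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # use the branch name from the answer
git config user.name "Example"
git config user.email "example@example.com"

commit() {  # hypothetical helper: add one file named after the commit, then commit
    echo "$1" > "$1.txt"
    git add "$1.txt"
    git commit -q -m "$1"
}

commit A; commit B
git checkout -q -b feature       # feature branches off at B
commit X; commit Y
git checkout -q master
commit C; commit D; commit E

git checkout -q feature
git merge -q --no-edit master    # M: the merge that "git pull origin master" creates

git checkout -q master
git merge -q --squash feature    # stages the changes from X, Y (and M); no commit yet
git commit -q -m Z               # Z: an ordinary single-parent commit on top of E

git log --oneline --graph master
```

The log at the end should show a straight line from A to Z, with C, D, and E appearing once each.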

If you had elected to rebase feature onto master instead of merging (in your case using pull), the graph would look like this:

    A---B---C---D---E (master)---X'---Y' (feature)

Where X' and Y' represent the re-written commits of X and Y. If you then squashed those two commits down into one, you would be in the same place as the squash merge. (And then you would merge feature into master to get master up to date.)
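The rebase route can be sketched the same way. Again the file names, the commit helper, and the identity are illustrative. I squash with git reset --soft followed by a fresh commit, which is just one non-interactive way to do it; an interactive rebase would work too.

```shell
#!/bin/sh
# Sketch only: throwaway repo; file names, messages and identity are illustrative.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # use the branch name from the answer
git config user.name "Example"
git config user.email "example@example.com"

commit() {  # hypothetical helper: add one file named after the commit, then commit
    echo "$1" > "$1.txt"
    git add "$1.txt"
    git commit -q -m "$1"
}

commit A; commit B
git checkout -q -b feature       # feature branches off at B
commit X; commit Y
git checkout -q master
commit C; commit D; commit E

git checkout -q feature
git rebase -q master             # replay X and Y on top of E as X' and Y'
git log --oneline master..feature   # lists only the two rewritten commits

git reset -q --soft master       # move feature back to E, keep X' and Y' changes staged
git commit -q -m Z               # one commit holding both sets of changes

git checkout -q master
git merge -q --ff-only feature   # master fast-forwards to Z, no merge commit
git log --oneline --graph master
```

You end up with the same linear A through E followed by Z as with the squash merge.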

Does origin now contain A,B,C twice?

Assuming you push master to origin, in neither of these cases would origin (master) now contain A,B,C twice.

That being said, perhaps you had it conceptually backwards, because your graph looks like an attempt to squash master onto feature. Had you done that after master was already merged into feature, the squash merge would have had no effect, meaning Z would not even have been created. However, had you squash merged master onto feature without merging master into feature first, you would have gotten closer to what you were picturing, since the graph would look like this:


    A---B---C---D---E (master)
        \     
         X---Y---Z (feature)

And now Z would contain the changes from C, D, and E as you proposed. Then merging that back into master would look something like this:

    A---B---C---D---E---M (master)
        \              /
         X---Y--------Z (feature)

In that case Z is a pointless commit which you didn't need, and essentially the changes of the 3 commits are there twice. (Similar to when you cherry-pick commits onto other branches and then merge them back in later.)
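If you want to reproduce this backwards case, here is a rough sketch (same caveats: throwaway repository, made-up file names, commit helper, and identity). Each commit adds its own file, so the duplicated changes merge back without conflicts; with real overlapping edits you could well get conflicts here.

```shell
#!/bin/sh
# Sketch only: throwaway repo; file names, messages and identity are illustrative.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # use the branch name from the answer
git config user.name "Example"
git config user.email "example@example.com"

commit() {  # hypothetical helper: add one file named after the commit, then commit
    echo "$1" > "$1.txt"
    git add "$1.txt"
    git commit -q -m "$1"
}

commit A; commit B
git checkout -q -b feature       # feature branches off at B
commit X; commit Y
git checkout -q master
commit C; commit D; commit E

git checkout -q feature
git merge -q --squash master     # stage the changes of C, D and E on feature
git commit -q -m Z               # Z: a second copy of those changes, parent is Y

git checkout -q master
git merge -q --no-edit feature   # M: a real merge commit with two parents

git show --stat --format=%s feature   # Z touches C.txt, D.txt and E.txt
git log --oneline --graph master
```

Here the graph keeps both lines of history, and the changes from C, D, and E are recorded once on master and once more inside Z.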

For your last question:

Should we use rebase instead?

If you're comparing merge vs rebase vs squash, it partially comes down to what you wish your graph to look like, but perhaps more importantly which information you wish to retain long term:

Squash has the least amount of information retention. It will contain the changes only. It does not retain any of the information about the authors, dates, developer's desired order and contents of commits, or original branching points at the time the development began.
Rebase retains the information about the authors, dates, and developer's desired order and contents of commits, but does not retain the original branching points at the time the development began.
Merge retains the information about the authors, dates, and the original branching points at the time the development began. It generally does not reflect the developer's desired order and contents of commits; instead it retains the actual order and contents of commits as they occurred.

Which one you choose is a matter of taste. For feature branches I personally prefer the rebase option, but only if the developers actually create meaningful commits that provide value on their own, and if multiple developers sharing a feature branch are comfortable with communicating force pushes and resetting their branches properly. If feature branch development doesn't meet those criteria I would prefer squash. For long-lived shared branches such as with Git Flow, I prefer merge (and --no-ff too). I would never rebase a long-lived shared branch.