Git fetch and git pull relationship

Looking up the difference between git pull and git fetch, many sources say that git pull is a superset of fetch, i.e. git pull is fetch + merge.

However, I seem to remember many times when git pull told me that everything was up to date, but git fetch yielded new information.

Can someone explain this discrepancy between theory and reality?

CodePudding user response：

Pull is indeed fetch plus merge.

Except when it's not.

When isn't it? When it's fetch plus rebase, or—very rarely—fetch plus checkout. But in all three cases, it's still:

1. git fetch, followed by
2. some second Git command to do something with the fetched commits.

Where this gets complicated is not so much in the second command—though that second command does complicate things—but rather in the arguments passed from git pull. Since git pull is running two other Git commands, and Git commands' actions depend on their options and arguments, it matters what options and arguments git pull passes to git fetch and to that second command, whatever it may be.
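As a rough illustration of the "fetch, then a second command" idea, here is a minimal sketch that builds throwaway repositories and compares git pull with git fetch followed by git merge. The directory names (upstream, via-pull, via-fetch) and the commit_in helper are made up for the demo, and --ff-only is used only to keep the second step simple; it is not a claim about the exact arguments git pull passes internally.

```shell
#!/bin/sh
# Sketch: compare "git pull" against "git fetch" + "git merge" in throwaway repos.
set -e
work=$(mktemp -d)
cd "$work"

# Hypothetical helper: make an empty commit using a fixed demo identity.
commit_in() {
    git -C "$1" -c user.name=demo -c user.email=demo@example.com \
        commit -q --allow-empty -m "$2"
}

git init -q upstream
commit_in upstream "first"

# Two identical clones of the same starting point.
git clone -q upstream via-pull
git clone -q upstream via-fetch

# The upstream repository moves ahead after the clones were made.
commit_in upstream "second"

# One command ...
git -C via-pull pull -q --ff-only

# ... versus the two commands it stands for.
git -C via-fetch fetch -q
git -C via-fetch merge -q --ff-only FETCH_HEAD

pull_head=$(git -C via-pull rev-parse HEAD)
fetch_head=$(git -C via-fetch rev-parse HEAD)
echo "via-pull:  $pull_head"
echo "via-fetch: $fetch_head"
```

In this simple case both clones should end up on the same commit. The interesting differences only appear once extra options and arguments enter the picture.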

Aside: a look into history

In the early days of Git, there were no "remotes" like origin, which meant there were no "remote-tracking names" either. You would run:

git fetch git://name-of-linus-torvalds-machine/repos/foo.git

to get stuff from Linus and then run git merge FETCH_HEAD, or something along these lines. This was error-prone (easy to have a typo in the URL) and annoying, so Git acquired a bunch of temporary methods to deal with this.
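To make the old workflow concrete, here is a small sketch that fetches by location rather than by remote name. A local path stands in for the git:// URL, and the repository names (theirs, mine) and the commit_in helper are invented for the demo. The remote is removed on purpose so that the repository has no remotes at all.

```shell
#!/bin/sh
# Sketch: fetch from a bare location (no remote), then merge FETCH_HEAD.
set -e
work=$(mktemp -d)
cd "$work"

# Hypothetical helper: make an empty commit using a fixed demo identity.
commit_in() {
    git -C "$1" -c user.name=demo -c user.email=demo@example.com \
        commit -q --allow-empty -m "$2"
}

git init -q theirs
commit_in theirs "shared starting point"

# Start from a copy, then delete the remote to mimic a remote-less setup.
git clone -q theirs mine
git -C mine remote remove origin

commit_in theirs "their new work"

cd mine
# Fetch by location; the path plays the role of the git:// URL.
git fetch -q "$work/theirs" HEAD

# No remote-tracking names exist; the only record of the fetch is FETCH_HEAD.
tracking=$(git for-each-ref refs/remotes)
cat .git/FETCH_HEAD

git merge -q --ff-only FETCH_HEAD
merged=$(git rev-parse HEAD)
echo "now at: $merged"
```

The fetched commit is recorded only in .git/FETCH_HEAD, so the merge step has to name FETCH_HEAD explicitly.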

Note that with no remotes, all git fetch could do was leave a bunch of information in .git/FETCH_HEAD so that you could figure out which branches in Linus's repos had been updated and so on. And of course, git pull wrapped these two commands into one, so that you didn't have to run two separate commands, and most people used git pull. But something was clearly missing. So remotes were invented:

We now had a short simple name like origin that we could use instead of a URL. (This removed the need for the weird older hacks for naming remotes, though they are all still supported and still listed in the documentation. Look for Named file in $GIT_DIR.)

We now had a way for Git to save the hash IDs associated with Linus's latest versions, so that we didn't need to create lots of branches locally. The remote-tracking names (origin/master and the like) take over a job that used to require a local branch name.
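A minimal sketch of what remote-tracking names provide, assuming a reasonably modern Git: after a plain git fetch origin, the remote-tracking name records the other repository's latest commit while the local branch stays where it was. The repository names (upstream, clone) and the commit_in helper are hypothetical.

```shell
#!/bin/sh
# Sketch: "git fetch origin" moves the remote-tracking name, not the local branch.
set -e
work=$(mktemp -d)
cd "$work"

# Hypothetical helper: make an empty commit using a fixed demo identity.
commit_in() {
    git -C "$1" -c user.name=demo -c user.email=demo@example.com \
        commit -q --allow-empty -m "$2"
}

git init -q upstream
commit_in upstream "first"
git clone -q upstream clone
commit_in upstream "second"

cd clone
before=$(git rev-parse HEAD)
git fetch -q origin
after=$(git rev-parse HEAD)

# '@{upstream}' resolves to the remote-tracking name for the current branch.
tracking=$(git rev-parse '@{upstream}')

# Show local branch names and remote-tracking names side by side.
git for-each-ref --format='%(refname) %(objectname)' refs/heads refs/remotes
```

The local branch is untouched by the fetch; it is the second command (merge, rebase, and so on) that moves it.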

But all these things are still supported and some of them are still described as "the way to do things" in some (ancient) documents, so you can still use the old crude methods. Perhaps some do.

In any case, remote-tracking names now exist. However, between Git 1.7 and Git 2.0, there were some updates to how they work. Specifically, Git 1.8.4 fixed something eventually declared to be a bug. Some people are still using Git 1.7.x for some strange reason, so be aware that you could run into the old behavior.

In Git 2.11, the old git pull shell script was formally retired. While git pull still effectively runs git fetch followed by a second Git command, you can no longer point to the shell script and say: "See, here at this line, it runs git fetch. Then it has these tests and then it eventually runs this other command..." The result is that it runs much faster on Windows, and is much harder to explain.