When I look up the difference between git pull and git fetch, many sources say that git pull is a superset of fetch, i.e. git pull is fetch + merge.
However, I seem to remember many times where git pull told me that everything was up to date, but fetch yielded new information.
Can someone explain this discrepancy between theory and reality?
CodePudding user response:
Pull is indeed fetch plus merge.
Except when it's not.
When isn't it? When it's fetch plus rebase, or—very rarely—fetch plus checkout. But in all three cases, it's still:
- git fetch, followed by
- some second Git command to do something with the fetched commits.
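The two steps can be run by hand. The following is a minimal sketch in a throwaway sandbox, not a description of git pull's internals; the repository names upstream and clone, the branch name main, and the demo identity are all assumptions made up for the illustration.

```shell
#!/bin/sh
# Sketch: do by hand what `git pull` does in one go.
set -e
sandbox=$(mktemp -d)
cd "$sandbox"

# A hypothetical "upstream" repository with one commit on branch main.
git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/main
git -C upstream -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "first"

# Clone it, then let upstream move ahead by one commit.
git clone -q upstream clone
git -C upstream -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "second"

# Step 1: fetch. New commits arrive, but the local branch does not move.
git -C clone fetch -q origin
before=$(git -C clone rev-parse main)

# Step 2: the second command. Merge is shown here;
# `git rebase origin/main` would be the rebase variant.
git -C clone merge -q --ff-only origin/main
after=$(git -C clone rev-parse main)
upstream_tip=$(git -C upstream rev-parse main)

echo "main before step 2: $before"
echo "main after step 2:  $after"
```

Only the second step moves the local branch; the fetch alone leaves it where it was.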
Where this gets complicated is not so much in the second command (though that second command does complicate things) but rather in the arguments passed from git pull. Since git pull is running two other Git commands, and Git commands' actions depend on their options and arguments, it matters what options and arguments git pull passes to git fetch and to that second command, whatever it may be.
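To see why the arguments matter, compare a fetch that names one branch (which is what git pull origin main hands to git fetch) with a plain fetch. This is a sketch under assumptions: the sandbox layout and the names upstream, clone, main and feature are hypothetical.

```shell
#!/bin/sh
# Sketch: a fetch limited to one branch sees less than a plain fetch.
set -e
sandbox=$(mktemp -d)
cd "$sandbox"

git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/main
git -C upstream -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "first"
git clone -q upstream clone

# After the clone, upstream gains a new branch named feature.
git -C upstream branch feature

# The fetch that `git pull origin main` would run: one remote, one branch.
git -C clone fetch -q origin main
narrow=$(git -C clone for-each-ref --format='%(refname:short)' \
    refs/remotes/origin/feature)

# A plain fetch uses the remote's configured refspec and sees every branch.
git -C clone fetch -q origin
wide=$(git -C clone for-each-ref --format='%(refname:short)' \
    refs/remotes/origin/feature)

echo "after narrow fetch: ${narrow:-<nothing>}"
echo "after plain fetch:  $wide"
```

This is one way a pull can report nothing new while a later fetch still brings in new information.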
Aside: a look into history
In the early days of Git, there were no "remotes" like origin, which meant there were no "remote-tracking names" either. You would run:
git fetch git://name-of-linus-torvalds-machine/repos/foo.git
to get stuff from Linus and then run git merge FETCH_HEAD, or something along these lines. This was error-prone (it was easy to make a typo in the URL) and annoying, so Git acquired a bunch of temporary methods to deal with this.
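The old workflow still works today. Here is a rough sketch that fetches by path instead of by remote name and then merges FETCH_HEAD; a local directory stands in for the git:// URL, and the names theirs, mine and main are hypothetical.

```shell
#!/bin/sh
# Sketch: fetch from a bare location (no remote), then merge FETCH_HEAD.
set -e
sandbox=$(mktemp -d)
cd "$sandbox"

git init -q theirs
git -C theirs symbolic-ref HEAD refs/heads/main
git -C theirs -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "first"

# Start from shared history, then drop the remote to mimic the old days.
git clone -q theirs mine
git -C mine remote remove origin
remotes=$(git -C mine remote)

git -C theirs -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "their new work"

# Fetch by location: the only record of the result is .git/FETCH_HEAD.
git -C mine fetch -q "$sandbox/theirs" main
cat mine/.git/FETCH_HEAD
fetch_head=$(git -C mine rev-parse FETCH_HEAD)

# The second command, as in the text.
git -C mine merge -q --ff-only FETCH_HEAD
mine_tip=$(git -C mine rev-parse main)
theirs_tip=$(git -C theirs rev-parse main)
```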
Note that with no remotes, all git fetch could do was leave a bunch of information in .git/FETCH_HEAD so that you could figure out which branches in Linus's repos had been updated and so on. And of course, git pull wrapped these two commands into one, so that you didn't have to run two separate commands, and most people used git pull. But something was clearly missing. So remotes were invented:
- We now had a short simple name like origin that we could use instead of a URL. (This got rid of the need for all the weird hacks for naming remotes, though they are all still listed in the documentation. Look for "Named file in $GIT_DIR".)
- We now had a way for Git to save the hash IDs associated with Linus's latest versions, so that we didn't need to create lots of branches locally. The remote-tracking names (origin/master and the like) take over a job that in the past required using a local branch name.
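Both points can be checked in a scratch repository. The sketch below adds a remote by name, fetches, and lists what Git stored; the names upstream, mine and main are assumptions for the example only.

```shell
#!/bin/sh
# Sketch: a named remote plus the remote-tracking names a fetch creates.
set -e
sandbox=$(mktemp -d)
cd "$sandbox"

git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/main
git -C upstream -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "first"

# An empty repository that knows upstream only by the short name origin.
git init -q mine
git -C mine remote add origin "$sandbox/upstream"
git -C mine fetch -q origin

# The fetched hash ID is saved under refs/remotes/, not as a local branch.
tracked=$(git -C mine rev-parse refs/remotes/origin/main)
local_branches=$(git -C mine for-each-ref --format='%(refname)' refs/heads)
upstream_tip=$(git -C upstream rev-parse main)

echo "origin/main:    $tracked"
echo "local branches: ${local_branches:-<none>}"
```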
But all these things are still supported and some of them are still described as "the way to do things" in some (ancient) documents, so you can still use the old crude methods. Perhaps some do.
In any case, remote-tracking names now exist. However, between Git 1.7 and Git 2.0, there were some updates to them. Specifically, Git 1.8.4 changed a behavior that was eventually declared to be a bug: before that release, git fetch origin master (and therefore git pull origin master) did not update origin/master. Some people are still using Git 1.7.x for some strange reason, so be aware that you could run into the old behavior.
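A quick way to see which behavior a given Git has is to fetch a single named branch and check whether the remote-tracking name moved. Since Git 1.8.4, git fetch origin <branch> also updates origin/<branch>; older versions left it alone. This is only a sketch, with hypothetical names upstream, clone and main.

```shell
#!/bin/sh
# Sketch: does a single-branch fetch update the remote-tracking name?
set -e
sandbox=$(mktemp -d)
cd "$sandbox"

git init -q upstream
(cd upstream &&
    git symbolic-ref HEAD refs/heads/main &&
    git -c user.name=demo -c user.email=demo@example.com \
        commit -q --allow-empty -m "first")
git clone -q upstream clone
(cd upstream &&
    git -c user.name=demo -c user.email=demo@example.com \
        commit -q --allow-empty -m "second")

# Fetch exactly one branch, as `git pull origin main` would.
(cd clone && git fetch -q origin main)
tracking_tip=$(cd clone && git rev-parse origin/main)
upstream_tip=$(cd upstream && git rev-parse main)

git --version
if [ "$tracking_tip" = "$upstream_tip" ]; then
    echo "origin/main was updated (1.8.4-or-later behavior)"
else
    echo "origin/main is stale (pre-1.8.4 behavior)"
fi
```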
In Git 2.11, the old git pull shell script was formally retired. While git pull still effectively runs git fetch followed by a second Git command, you can no longer point to the shell script and say: "See, here at this line, it runs git fetch. Then it has these tests and then it eventually runs this other command..." The result is that it runs much faster on Windows, and is much harder to explain.