I'm reading John Wiegley's Git from the bottom up. In A commit by any other name… he mentioned:
name1..name2
This and the following aliases indicate commit ranges, which are supremely useful with commands like log for seeing what’s happened during a particular span of time. The syntax to the left refers to all the commits reachable from name2 back to, but not including, name1. If either name1 or name2 is omitted, HEAD is used in its place.
master..
This usage is equivalent to master..HEAD. I’m adding it here, even though it’s been implied above, because I use this kind of alias constantly when reviewing changes made to the current branch.
I'm confused here: HEAD
shall always be the last commit of master
right? Then what does master..HEAD mean?
Answer:
HEAD
shall always be the last commit of master
right?
Only if you are currently "on" branch master
.
Let's be totally concrete with a few examples, so that you can see how this works. We'll start by creating a new, totally empty Git repository:
$ mkdir example && cd example
$ git init
Initialized empty Git repository in ...
$ echo just an example > README
$ git add README
$ git commit -m initial
[master (root-commit) a3de110] initial
1 file changed, 1 insertion(+)
create mode 100644 README
You will get a different hash ID than I did, and if you set up your Git to create new repositories using a different branch name (e.g., main
), you will get a different branch name, but either way the:
[<some name here> (root-commit) <abbreviated-hash>] initial
line tells you several things:
- This is a root commit, which in this case is the very first commit in this new repository. The first commit is kind of special because it is always a root commit (and often is the only root commit). Root commits aren't particularly interesting. In a sense, they're particularly dull, because they are where all the action stops.
- The current branch name is (in my case) master.
- The abbreviated hash ID of my new commit was a3de110.
The full hash ID of any commit is its "true name": this name works in every Git repository and means that commit. Since I just made this commit in my repository, you don't have it, and therefore you don't have this hash ID either. These hash IDs are universally unique![1]
Anyway, let's take a look at the name HEAD
right now:
$ git rev-parse --symbolic-full-name HEAD
refs/heads/master
$ git rev-parse HEAD
a3de1101707189f42f01b50fed47aa350398f49a
Here we see that, yes, we're on branch master
—its full name is refs/heads/master
—and its full hash ID is that big ugly hexadecimal number.
[1] We can prove that true uniqueness is mathematically impossible (it's obvious from the pigeonhole principle), and since Git depends on uniqueness, we've proven that Git will eventually break. In practice, however, Git keeps working for decade after decade: the actual chance of breakage is similar to (and almost always actually smaller than) the chance that you'll be struck by lightning and die while you sit at your keyboard and work in your repository. It's theoretically possible, it just doesn't happen often enough in real life to worry about.
There aren't enough commits to be interesting yet
Since a repository with just one commit is so dull, let's make several more commits. Let's also make some additional branch names:
$ echo one > file1
$ git add file1
$ git commit -m 'add a first file'
[master 0f553c5] add a first file
1 file changed, 1 insertion(+)
create mode 100644 file1
$ echo two >> file1
$ git add file1 && git commit -m 'add a second line to file1'
[master fbdddac] add a second line to file1
1 file changed, 1 insertion(+)
$ git switch -c b1
Switched to a new branch 'b1'
$ echo two > file2 && git add file2 && git commit -m 'add second file'
[b1 51a610e] add second file
1 file changed, 1 insertion(+)
create mode 100644 file2
$ git switch master
Switched to branch 'master'
$ git switch -c b2
Switched to a new branch 'b2'
$ echo three > file3 && git add file3 && git commit -m 'add different second file'
[b2 e683c47] add different second file
1 file changed, 1 insertion(+)
create mode 100644 file3
We now have three branch names (master
, b1
, and b2
) and are currently "on" b2
:
$ git log --all --decorate --oneline --graph
* e683c47 (HEAD -> b2) add different second file
| * 51a610e (b1) add second file
|/
* fbdddac (master) add a second line to file1
* 0f553c5 add a first file
* a3de110 initial
This shows how Git draws the graph (see also Pretty Git branch graphs), and if we use git rev-parse
we can see that HEAD
refers to b2
, which means the commit whose hash starts with e683c...
:
$ git rev-parse --symbolic-full-name HEAD
refs/heads/b2
$ git rev-parse HEAD
e683c472999de3d39c3e69d030b362df561b5711
$ git rev-parse b2
e683c472999de3d39c3e69d030b362df561b5711
Now we have enough stuff to work with
For StackOverflow discussion purposes, I like to draw my graph horizontally instead of vertically. Git puts the newest commits towards the top; I put them towards the right. Here's the same graph, except that instead of big ugly hash IDs, I use single letters to stand in for each commit, and draw newer commits towards the right:
        D   <-- b1
       /
A--B--C   <-- master
       \
        E   <-- b2 (HEAD)
The newest commit, add different second file
(e683c...
), is on the bottom row. The branch name b2
—that is, refs/heads/b2
—points to this commit, or in other words, git rev-parse b2
prints out that big ugly hash ID.
The special name HEAD
is currently attached to the name b2
. So HEAD
means b2
right now.
If we run:
git log master..b2
or:
git log master..HEAD
or:
git log master..
(all of which currently mean the same thing) we'll see commit e683c...
:
$ git log master..
commit e683c472999de3d39c3e69d030b362df561b5711 (HEAD -> b2)
Author: Chris Torek <[email protected]>
Date: Mon Dec 19 01:00:28 2022 -0800
add different second file
And indeed, that's just what we see. If I switch back to master
:
$ git switch master
Switched to branch 'master'
that changes the attachment of HEAD
: it's now attached to master
, so now master..HEAD
means the same as master..master
, which means no commits, and that's what we'll see in git log
output:
$ git log master..
$
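If you'd like to check this without reading log output, here is a minimal sketch that rebuilds the same three-branch repository in a throwaway directory and counts the commits in master..HEAD before and after switching. The identity settings and the temporary directory are placeholders I made up, and I use git checkout instead of git switch only so that the script also runs on older Git versions.

```shell
#!/bin/sh
set -e
repo=$(mktemp -d) && cd "$repo"            # throwaway directory (placeholder)
git init -q
git symbolic-ref HEAD refs/heads/master    # force the branch name used in this answer
git config user.name "Example User"        # placeholder identity, not a real one
git config user.email "user@example.invalid"
git config commit.gpgsign false

echo "just an example" > README && git add README && git commit -q -m initial
echo one > file1 && git add file1 && git commit -q -m 'add a first file'
echo two >> file1 && git add file1 && git commit -q -m 'add a second line to file1'
git checkout -q -b b1                      # same effect as: git switch -c b1
echo two > file2 && git add file2 && git commit -q -m 'add second file'
git checkout -q master
git checkout -q -b b2
echo three > file3 && git add file3 && git commit -q -m 'add different second file'

attached=$(git symbolic-ref --short HEAD)        # branch that HEAD is attached to
on_b2=$(git rev-list --count master..HEAD)       # commits on b2 that master lacks
short_form=$(git rev-list --count master..)      # omitted right side means HEAD
git checkout -q master
on_master=$(git rev-list --count master..HEAD)   # now the same as master..master
echo "on $attached: $on_b2 commit(s); on master: $on_master commit(s)"
```

The count drops to zero after the switch because HEAD and master then name the same commit, so there is nothing reachable from one that isn't reachable from the other.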
Selection with and without history
In most Git commands, the two-dot syntax A..B "means":
- starting at commit B and working backwards, list out all the commits you can find this way, but then
- starting at commit A and working backwards, exclude all the commits you can find this way.
We "paint" or "highlight" the first set of commits in green, and then "paint over" (or re-highlight) everything in the second set in red. The commits that are now highlighted only in green are the ones we've selected.
This works this way for git log
, git cherry-pick
, and some other Git commands. It takes a while to get used to. For instance, given the graph I have above, what does b1..b2
mean? Well, let's look at the graph:
        D   <-- b1
       /
A--B--C   <-- master
       \
        E   <-- b2
(I took HEAD
out since we're not using it at the moment). We want to start at b2
—commit E
, or e683c...
in the repository I just made, and highlight that in green. Then we work backwards: E
leads backwards to C
, so we highlight C
in green, and move back another step to B
and highlight that in green, and move back to A
and highlight it green. Commit A
, being our root commit, doesn't have any commits before it, so that's where we stop highlighting in green.
Now we highlight commit D
—commit 51a61...
—in red. We weren't going to list it anyway, but we still "paint it red" for now. Then we move back one hop, to commit C
, which is the commit that comes before D
, and paint that one red. This "red paint" overwrites the earlier "green paint", so now we're not going to show commit C
. Then we move back another hop to B
and mark it in red, and so on.
The end result is that we show only commit E
. This is the same as master..b2
. The reason it shows up the same, even though master
selects C
while b1
selects D
, is that C
-and-earlier are the only commits that git log b2
would have selected. The de-selection process de-selects C
-and-earlier, regardless of whether we start it out at C
itself, or start it out at D
.
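The green/red walk can be reproduced with a small sketch. Here I build the same graph shape out of empty commits whose messages are just the letters A through E (the mk helper, the identity settings, and the temporary directory are all made up for this illustration), then ask git log for the subject lines only. The last command uses the ^b1 spelling, which is another way of writing "exclude everything reachable from b1".

```shell
#!/bin/sh
set -e
repo=$(mktemp -d) && cd "$repo"            # throwaway directory (placeholder)
git init -q
git symbolic-ref HEAD refs/heads/master    # force the branch name used in this answer
git config user.name "Example User"        # placeholder identity, not a real one
git config user.email "user@example.invalid"
git config commit.gpgsign false

# Hypothetical helper: make an empty commit whose message is a single letter.
mk() { git commit -q --allow-empty -m "$1"; }

mk A; mk B; mk C
git checkout -q -b b1; mk D
git checkout -q master
git checkout -q -b b2; mk E

green=$(git log --format=%s b2 | xargs)              # everything painted green
red=$(git log --format=%s b1 | xargs)                # everything painted red
from_b1=$(git log --format=%s b1..b2 | xargs)        # green that survives the red paint
from_master=$(git log --format=%s master..b2 | xargs)
caret=$(git log --format=%s b2 ^b1 | xargs)          # same selection as b1..b2
echo "green: $green | red: $red | b1..b2: $from_b1 | master..b2: $from_master"
```

Only E survives in both ranges, since C and everything before it is reachable from b1 as well as from master.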
This gets us into another confusing thing in Git. The git log
command works by doing "selection with history". That is, we run:
git log b2
and Git uses branch name b2
to find commit E
and shows that commit, then works backwards. This shows us the entire history as found by starting at E
and working backwards.
Some Git commands, git log
being the most common one you'll use, do just this sort of thing: select with history. That is, you tell the commands "start here" and they do that, but then they also work backwards from there. For these commands, the range syntax b1..b2
or stop..start
or whatever keeps them from working all the way back to the beginning of time.
Other Git commands, such as git cherry-pick
, don't do this kind of "select with history". If you run:
git cherry-pick e683c472999de3d39c3e69d030b362df561b5711
Git will locate commit e683c472999de3d39c3e69d030b362df561b5711
(assuming you have my last commit here) and attempt to copy it to a new and supposedly improved commit, since that's what git cherry-pick
is for.[2] Cherry-pick, by default, selects without history, and using stop..start
with git cherry-pick
makes it copy all the commits in the range.
So you end up needing to know: Does this command select with history or without history by default? But it turns out that as you use Git, this eventually "feels natural". Just experiment a bit the first time you use a new command: make a temporary "junk" clone and see what happens.
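As one such experiment, here is a rough sketch that counts commits on master before and after two cherry-picks: first a single name, then a range. The save helper, the identity settings, and the temporary directory are my own inventions for the example. If cherry-pick worked backwards through history the way git log does, the first pick would try to copy far more than one commit.

```shell
#!/bin/sh
set -e
repo=$(mktemp -d) && cd "$repo"            # throwaway directory (placeholder)
git init -q
git symbolic-ref HEAD refs/heads/master    # force the branch name used in this answer
git config user.name "Example User"        # placeholder identity, not a real one
git config user.email "user@example.invalid"
git config commit.gpgsign false

# Hypothetical helper: append a line to a file and commit the result.
save() { echo "$2" >> "$1" && git add "$1" && git commit -q -m "$3"; }

save README 'just an example' initial
save file1 one 'add a first file'
save file1 two 'add a second line to file1'
git checkout -q -b b1; save file2 two 'add second file'
git checkout -q master
git checkout -q -b b2; save file3 three 'add different second file'
git checkout -q master

before=$(git rev-list --count HEAD)        # commits reachable from master
git cherry-pick b2 >/dev/null              # one name: copies just that one commit
after_one=$(git rev-list --count HEAD)
git cherry-pick master..b1 >/dev/null      # a range: copies every commit in it
after_range=$(git rev-list --count HEAD)
echo "before: $before, after single pick: $after_one, after range pick: $after_range"
```

Each cherry-pick adds exactly one commit here, because the range master..b1 also happens to contain only one commit by the time it runs.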
[2] Once any commit is made, it's unchangeable, so if you mean to change stuff other than "where the commit goes in the graph" or "the log message for that commit", you'll probably want git cherry-pick -n
, which means copy the work, but don't commit yet. Git being what it is, there are workarounds if you screw up here, and the main thing to remember is this: If you've committed something, it's now in Git and you can probably get it back as long as you still have the repository; if you have not committed it's not in Git and Git cannot help you get it back. So it tends to be a good idea to make a lot of small commits.
But git diff
is weird
One command we use a whole lot in Git is git diff
, and like git log
, you can run:
git diff A..B
The thing about git diff
is that it refuses to obey all the normal rules here. The two-dot syntax does not mean "every commit in the range". Instead, git diff A..B
means exactly the same thing as git diff A B
. The Git authors/maintainers have decided to encourage people to use the second form, which has the minor added advantage of being one character shorter. With the second form, though, you can't leave out a name to get the implied HEAD
, because:
git diff xyzzy
means compare the commit selected by the name xyzzy
to the current working tree rather than compare the commit selected by xyzzy
to the current commit. You have to use:
git diff xyzzy HEAD
for the latter. It's a little annoying, really. It might be nice if git diff
just rejected the two-dot syntax entirely.