What is the "base" of a git diff?

Time:07-06

The XY question that I want to ask, but suspect is not fully on topic, is "How can I remember/memorise the various syntaxes of git diff?" I personally find the git diff syntaxes confusing, and it's not clear to me how to remember what they do intuitively (i.e. without rote memorisation), because the underlying logic isn't apparent.

But to break down the question further, what is unclear to me is this: what is the default base of a git diff?

Here is my understanding of the git diff syntaxes so far:

  • git diff: compares working directory with staging area
  • git diff HEAD: compares working directory with HEAD (current commit)
  • git diff branch/commit: as above, compares WD with branch/commit
  • git diff --staged: compares staging area with HEAD (?!)
  • git diff foo bar: compare foo and bar

The most "intuitive" and easy-to-understand syntax is git diff foo bar, i.e. diffing two commit-ishes. If "foo" is the "base" of the diff, then "bar" is the "comparison". The rest are not intuitive to me. For example, I would have thought git diff --staged shows the difference between the working directory and the staging area, but that's not the case.

If I were to re-write these syntaxes using a file1 file2 syntax, I think it would look something like this:

  • git diff -> git diff WD..index
  • git diff HEAD -> git diff WD..HEAD
  • git diff commit/branch -> git diff WD..commit/branch
  • git diff --staged -> git diff index..HEAD

So my hypothesis is this: the base of a git diff is always the working directory, unless you specify --staged or --cached, in which case the base changes to the staging area.

The question is:

  • Is this a correct interpretation of the logic behind the git diff syntax, and the "base" of the git diff?
  • Is there another more intuitive way to understand the logic behind the git diff syntax, rather than just memorising the commands?

CodePudding user response:

What you've written is basically correct, but your .. syntax is flipped,* and "base" is ambiguous so I'll use the terms "old" and "new". Here is a correct table:

  • git diff <commit1> <commit2>: general syntax (<commit1> is "old", <commit2> is "new")
  • git diff -> git diff INDEX WORKTREE
  • git diff <commit> -> git diff <commit> WORKTREE   (<commit> can be HEAD, a branch, etc.)
  • git diff --cached -> git diff HEAD INDEX   (--staged is a synonym for --cached)
  • git diff --cached <commit> -> git diff <commit> INDEX
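If you want to check the table against a real repository, here is a minimal sketch. It assumes a throwaway repository created with mktemp, a hypothetical file f.txt and a placeholder identity; none of these come from the question. HEAD, the index and the worktree each hold a different version of the file, so the "old" and "new" side of each command is visible in its output.

```shell
# Throwaway repository; the path, file name and identity are placeholders.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.name "Example"
git config user.email "example@example.com"

echo v1 > f.txt
git add f.txt
git commit -q -m "v1"    # HEAD holds v1
echo v2 > f.txt
git add f.txt            # index holds v2
echo v3 > f.txt          # worktree holds v3

# Keep only the removed/added lines of a diff, joined on one line.
changed() { grep -E '^[-+]' | grep -vE '^(---|\+\+\+)' | paste -sd' ' -; }

index_to_worktree=$(git diff | changed)
head_to_index=$(git diff --cached | changed)
head_to_worktree=$(git diff HEAD | changed)

echo "git diff           $index_to_worktree"    # -v2 +v3
echo "git diff --cached  $head_to_index"        # -v1 +v2
echo "git diff HEAD      $head_to_worktree"     # -v1 +v3
```

In every case the "-" line comes from the "old" side and the "+" line from the "new" side, which matches the right-hand column of the table.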

Summary:

  • When you specify fewer than two commits, the default behaviour is to use the worktree as the "new" version. You can specify --cached to use the index as the "new" version instead.

  • If you don't specify the "old" version, Git will choose a sensible default (when comparing the worktree, "old" will be the index; when comparing the index, "old" will be HEAD).

One of the reasons for the special cases is that Git's revision syntax can only refer to commits – it can't be used to denote the worktree or the index. So the general syntax (git diff <commit1> <commit2>) can't be used to compare the worktree or the index.

The shortest forms (git diff and git diff --cached) are very convenient when using the following workflow:

  1. Make some edits
  2. Add them to the index
  3. Make some more edits
  4. Add them to the index
  5. Commit

After steps 1 and 3 (edits), you can use git diff to review the most recent edits before you add them to the index. This happens frequently, so it's nice that the command is short to type.

When you are about to reach step 5 (commit), you can use git diff --cached to view the changes in the index, which are about to be committed.
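The five-step workflow can be sketched as a script. This is only an illustration: the temporary repository, the file notes.txt, its contents and the identity are all made up for the example.

```shell
# Throwaway repository; the path, file name and identity are placeholders.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.name "Example"
git config user.email "example@example.com"

echo first > notes.txt
git add notes.txt
git commit -q -m "initial"

# Added lines of a diff (skipping the "+++" header), joined on one line.
added() { grep -E '^\+[a-z]' | paste -sd' ' -; }

echo second >> notes.txt                       # 1. make some edits
review_1=$(git diff | added)                   # +second
git add notes.txt                              # 2. add them to the index
echo third >> notes.txt                        # 3. make some more edits
review_2=$(git diff | added)                   # +third (second is already staged)
git add notes.txt                              # 4. add them to the index
about_to_commit=$(git diff --cached | added)   # +second +third
git commit -q -m "add two lines"               # 5. commit
left_over=$(git diff --cached | added)         # empty: nothing staged any more

echo "review after step 1: $review_1"
echo "review after step 3: $review_2"
echo "staged before commit: $about_to_commit"
```

Note that the second git diff only shows the edit from step 3, because the edit from step 1 is already in the index and therefore part of the "old" side.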

The direction can be confusing (e.g. git diff --cached <commit> means git diff <commit> INDEX, note how the order flipped), but in almost all cases it will be what you want: when you compare the worktree to something, you almost always want to treat the worktree as the "new" version (i.e. "What edits have I made since <...>?"), and likewise when you compare the index to something you almost always want to treat the index as the "new" version (i.e. "What changes are about to go into the next commit?"). In the rare cases where you want the reverse direction, you can use git diff -R ....
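A small sketch of what -R does, again in a throwaway repository with a hypothetical file f.txt and placeholder identity: the same unstaged edit is shown once in the default direction (index is "old", worktree is "new") and once reversed.

```shell
# Throwaway repository; the path, file name and identity are placeholders.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.name "Example"
git config user.email "example@example.com"

echo v1 > f.txt
git add f.txt
git commit -q -m "v1"
echo v2 > f.txt          # unstaged edit: index holds v1, worktree holds v2

# Keep only the removed/added lines of a diff, joined on one line.
changed() { grep -E '^[-+]' | grep -vE '^(---|\+\+\+)' | paste -sd' ' -; }

forward=$(git diff | changed)       # index -> worktree
reverse=$(git diff -R | changed)    # worktree -> index

echo "git diff     $forward"    # -v1 +v2
echo "git diff -R  $reverse"    # -v2 +v1
```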


*The syntax git diff <commit1>..<commit2> is a synonym for git diff <commit1> <commit2>, i.e. the order is old..new. (Note that .. has a subtly different meaning outside git diff.)
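The equivalence in this footnote is easy to confirm. The sketch below assumes a throwaway repository with two commits of a hypothetical file f.txt, and compares the output of the two-argument form with the .. form.

```shell
# Throwaway repository; the path, file name and identity are placeholders.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.name "Example"
git config user.email "example@example.com"

echo v1 > f.txt
git add f.txt
git commit -q -m "v1"
echo v2 > f.txt
git commit -q -a -m "v2"

# Keep only the removed/added lines of a diff, joined on one line.
changed() { grep -E '^[-+]' | grep -vE '^(---|\+\+\+)' | paste -sd' ' -; }

two_args=$(git diff HEAD~1 HEAD)
two_dots=$(git diff HEAD~1..HEAD)
old_to_new=$(printf '%s\n' "$two_dots" | changed)
new_to_old=$(git diff HEAD..HEAD~1 | changed)

[ "$two_args" = "$two_dots" ] && echo "same output"
echo "HEAD~1..HEAD  $old_to_new"    # -v1 +v2
echo "HEAD..HEAD~1  $new_to_old"    # -v2 +v1
```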

CodePudding user response:

As tom answered, the term base isn't that great. I myself dislike old-and-new as well, because you can give git diff any two arbitrary commits to compare, and they can both be "very old". Well, we could use old and older, perhaps
