Is it possible to name different batches of staged files in git?

Time:04-30

For example, I have changes to a few files in my branch. I then stage them using git add.

Then I have other additional changes. I stage them again using git add.

Is it possible to name each staging? Thank you.

CodePudding user response:

The short answer is no.

The long answer is yes, but don't even try to do that, you'll just drive yourself crazy.

To understand why, let's look at how the "staging area" really works.

Git is about commits

Git is, on the whole, all about commits. It's not about files, though commits store files; and it's not about branches, although we organize commits into things we call branches, and we use branch names (which we also call "branches", even though they're quite different—see Haddock's Eyes for more about the confusion between things, names, and names of names) ... where was I? Ah yes, we use branch names to find commits, because the true name of each commit is a unique, but big, ugly, random-looking, too-difficult-for-humans-to-use hash ID or object ID (OID).

So each Git commit is numbered, with a unique number that gets expressed as a big ugly hexadecimal string. Git stores every commit—plus a bunch of supporting objects, which also get OIDs—in a big database of "all Git objects". Using the OIDs (hash IDs) as keys, Git can look up the objects nearly instantly in this key-value store, which is great, except for a few catches:

  • Nothing stored in the database can ever be changed.
  • A commit stores a full snapshot of every file (frozen in time, for all time). The files are de-duplicated across (and even within) commits to save space, which is perfectly OK to do since they're frozen for all time.
  • Each commit also stores some metadata, which we'll ignore for this particular answer.
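To make the key-value idea concrete, here is a minimal sketch run in a throwaway repository. The file names, identity, and commit messages are made up for illustration. It looks up a commit, its snapshot tree, and one file blob by OID, then checks that two files with identical contents share a single blob.

```shell
#!/bin/sh
set -eu

# Scratch repository in a temporary directory (all names here are hypothetical).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "example@example.com"

echo "hello" > greeting.txt
git add greeting.txt
git commit -q -m "first commit"

# The commit's true name is its hash ID (OID).
commit_oid=$(git rev-parse HEAD)
# The commit refers to a tree (the snapshot), which refers to blobs (file contents).
tree_oid=$(git rev-parse 'HEAD^{tree}')
blob_oid=$(git rev-parse HEAD:greeting.txt)

# Look each object up by its key in the object database.
commit_type=$(git cat-file -t "$commit_oid")
tree_type=$(git cat-file -t "$tree_oid")
blob_type=$(git cat-file -t "$blob_oid")
blob_content=$(git cat-file -p "$blob_oid")

# A second file with the same contents, added in a later commit,
# reuses the existing blob instead of storing a second copy.
cp greeting.txt copy.txt
git add copy.txt
git commit -q -m "second commit"
copy_oid=$(git rev-parse HEAD:copy.txt)

echo "$commit_oid is a $commit_type"
echo "$tree_oid is a $tree_type"
echo "$blob_oid is a $blob_type holding: $blob_content"
echo "copy.txt uses blob $copy_oid"
```

The exact hash values depend on the contents, identity, and timestamps, so they will differ from run to run for the commits, but the two blob IDs should always match each other.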

So each commit acts like a permanent archive (tar or zip or whatever) of every file. But these files are stored in a special, read-only, Git-only, compressed and de-duplicated form. Only Git can read them and literally nothing on your computer, not even Git itself, can overwrite them.[1]

This means you literally cannot use the committed files! This might make you wonder what good they are.

The solution to this problem is simple, though: when you use git switch or git checkout to select a commit, Git reads the frozen copies of the files (which Git can do, even if nothing else can) and uses that to write usable versions of the files. These usable versions go into your working tree, as ordinary everyday files. If this is all Git did, things would be pretty straightforward and you wouldn't be asking this question in the first place, but that's not all Git does.


[1] If something does manage to overwrite one, the result is a corrupted Git database: you can no longer extract any commit that uses that overwritten file. So Git won't do that; Git tries to prevent other programs from doing that; and nothing else should try in the first place.


Git's index AKA staging area

The tricky bit is that when Git extracts a commit, so that you can use it—either just to read it, or to build a new commit later—Git copies all the frozen files out of the deep-freeze into what Git calls its index or staging area. (Git has a third name for this thing, calling it the cache, although these days that name doesn't get much use: it mostly shows up as flag names, like git rm --cached.)

What's in the staging area is technically the file's name and mode, plus a reference to the frozen, de-duplicated copy. However, you can think of it as if it held a rewritable version of the frozen copy, because that's how Git uses it: when you run git add, Git:

  • reads the working tree copy of the file;
  • compresses it into the frozen form, but doesn't exactly freeze it just yet;[2]
  • checks to see if that's a duplicate; and
  • updates the index with the frozen form, either re-using the existing duplicate, or not, as appropriate.
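The steps above can be observed from the command line. This is only a sketch in a scratch repository with a made-up file name: git hash-object computes the blob ID the contents would get, and git ls-files --stage shows what the index entry actually records (mode, blob ID, stage number, path).

```shell
#!/bin/sh
set -eu

# Scratch repository (hypothetical file name; nothing is committed here).
repo=$(mktemp -d)
cd "$repo"
git init -q

echo "draft one" > notes.txt

# Compute the blob ID these contents would get, without writing anything.
expected=$(git hash-object notes.txt)

git add notes.txt

# An index entry is: mode, blob ID, stage number, then the path.
entry=$(git ls-files --stage notes.txt)
mode=$(echo "$entry" | awk '{print $1}')
staged=$(echo "$entry" | awk '{print $2}')

# The blob is already in the object database, even though no commit exists yet.
git cat-file -e "$staged"
staged_content=$(git cat-file -p "$staged")

echo "index entry: $entry"
echo "staged contents: $staged_content"
```

Note that the index entry holds only the name, mode, and a reference to the blob, which is why staging an unchanged file costs almost nothing.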

If you add a new name for a file that was not in the index at this point, Git adds the new name and frozen-format contents to the index. Using git rm or git rm --cached will remove the name and reference as appropriate, too.

So the result is that when you first check out some commit, the index holds a full copy[3] of that commit. As you run git add, the index copy gets updated. In effect, the index contains, at all times, your proposed next commit.

When you actually run git commit to make a new commit, Git takes the frozen-format index stuff and freezes it for real into a real commit. That new commit gets a new unique hash ID, and Git updates the current branch name to remember the new commit as the latest commit (Git arranges for the new commit's metadata to remember the previous branch-tip commit hash ID).
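One way to see that the index really is the proposed next commit is to ask Git for the tree it would build from the index right now, using the plumbing command git write-tree, and compare that with the tree of the commit made immediately afterwards. This is a sketch under simple assumptions (scratch repository, made-up file names, no hooks that alter the commit).

```shell
#!/bin/sh
set -eu

# Scratch repository with a hypothetical identity and file names.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "example@example.com"

echo "one" > a.txt
git add a.txt
git commit -q -m "first"
first=$(git rev-parse HEAD)

echo "two" > b.txt
git add b.txt

# Turn the current index into a tree object and print its ID.
proposed_tree=$(git write-tree)

git commit -q -m "second"

# The new commit's snapshot, and the parent it remembers in its metadata.
committed_tree=$(git rev-parse 'HEAD^{tree}')
parent=$(git rev-parse 'HEAD^')

echo "proposed tree:  $proposed_tree"
echo "committed tree: $committed_tree"
echo "parent commit:  $parent"
```

The two tree IDs should be identical, and the new commit's parent should be the previous branch tip.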

Now, the thing about the index is this: There's only the one index.[4] So there's no way to say "this is the index at point X" and then, later, "this is the index at point Y": there's only one index, and git add overwrote the point-X index when you made point-Y exist.

The end result of all this is that, since there's only the one index, there's just one staging area. There's no way to name separate batches within it, because each git add simply updates that single index.
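Here is a small sketch of that overwriting behavior, again in a scratch repository with a made-up file name. Two successive git add runs on the same file leave exactly one index entry, and it points at the second batch of contents.

```shell
#!/bin/sh
set -eu

# Scratch repository (hypothetical file name).
repo=$(mktemp -d)
cd "$repo"
git init -q

# "Batch one": stage the first version.
echo "batch one" > file.txt
git add file.txt
first_staged=$(git ls-files --stage file.txt | awk '{print $2}')

# "Batch two": stage a later version of the same file.
echo "batch two" > file.txt
git add file.txt
second_staged=$(git ls-files --stage file.txt | awk '{print $2}')

# There is still just one entry for the file, and it holds the newer contents.
entry_count=$(git ls-files --stage file.txt | wc -l | tr -d ' ')
staged_content=$(git cat-file -p :file.txt)

echo "after first add:  $first_staged"
echo "after second add: $second_staged"
echo "entries for file.txt: $entry_count"
echo "staged contents: $staged_content"
```

The index keeps no record that the first add ever happened, so there is nothing left to attach a "batch one" name to.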


[2] Technically, Git does add a frozen-format blob object immediately to the objects database if necessary. If it never gets used, though, it eventually gets dropped. So it's as if it never got added here, which is why it's more "slushy" than "frozen".

[3] Since what's in the index is de-duplicated, and the files that came out of the commit are definitely already in the database, this copy takes no space. The index entries themselves do take a little space (roughly 100 bytes per file), but the file contents are all pre-de-duplicated.

[4] This is technically false for two reasons:

  • Most importantly, there's one distinguished index per working tree. Using git worktree add you can make a new working tree, and it comes with its own new index.
  • For Git's own internal purposes, it's possible to create and populate a temporary index. Doing so is pretty tricky, but git commit needs to do this when you use git commit --only or git commit --include. In fact, for git commit --only, Git needs to create two temporary index files. The ability to use temporary index files instead of the index is exported: you can do it yourself with scripts. But if you do, you must use extreme caution, as the index—the distinguished index for this particular working tree—must remain in the correct relationship with the current commit and working tree. If the relationship is not carefully maintained, the next git commit will commit the wrong files.

So, while it's possible to have more than one index file active, it's not something you want to do for a long period, such as a few seconds. When the git commit command creates its temporary index copies, they only last until the commit itself finishes.
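For the curious, the temporary-index trick can be sketched with the GIT_INDEX_FILE environment variable, which tells Git commands to use a different index file. This is an illustration only, in a scratch repository; the temporary file name and the other names are made up, and it is not a recipe for naming staged batches. The temporary index is populated from the current commit, one extra file is staged into it, and the real index is left untouched.

```shell
#!/bin/sh
set -eu

# Scratch repository with a hypothetical identity and file names.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "example@example.com"

echo "base" > a.txt
git add a.txt
git commit -q -m "base"

echo "experiment" > b.txt

# Tree that the real index would produce before we touch anything.
main_before=$(git write-tree)

# Hypothetical location for a short-lived temporary index.
tmp_index="$repo/.git/tmp-index-example"

# Fill the temporary index from the current commit, then stage b.txt into it only.
GIT_INDEX_FILE="$tmp_index" git read-tree HEAD
GIT_INDEX_FILE="$tmp_index" git add b.txt
tmp_tree=$(GIT_INDEX_FILE="$tmp_index" git write-tree)

# The real index is unchanged and still knows nothing about b.txt.
main_after=$(git write-tree)
main_has_b=$(git ls-files b.txt)

# Throw the temporary index away as soon as we are done with it.
rm -f "$tmp_index"

echo "real index tree (before): $main_before"
echo "real index tree (after):  $main_after"
echo "temporary index tree:     $tmp_tree"
```

The temporary index here lives for a fraction of a second and is deleted immediately, which matches the advice above about not keeping a second index around.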
