Why does the master branch change too, before I commit the change?

Time:12-07

  • Sometimes I make a change on a second branch and don't want to commit it yet.
  • When I go back to master, I find that the master branch has changed too. Why?
$ mkdir tmp && cd $_ && git init && touch file.txt && echo "text wite on master" > file.txt
$ git add file.txt && git commit -m"init"
  • checkout second branch
$ git checkout -b seconde-branch && echo "text wite on seconde branch" >> file.txt
  • switch back to master without committing the change
$ git checkout master
$ cat file.txt

text wite on master
text wite on seconde branch
  • The output should be only: text wite on master
  • Also, when I use git restore on master, I lose the change that I wrote on the second branch. This makes me avoid using branches, because it's confusing.

CodePudding user response:

The files you see and work on, in your working tree, are not in any branch.

The way to understand this is to remember the following rules:

  1. Git is about commits.
  2. Git is not about files, although commits hold files.
  3. Git is not about branches, although branch names help us (and Git) find commits.

What Git cares about—what Git stores and transmits to other Git repositories—are the commits.

Each commit:

  • is numbered: every commit has a big, ugly, random-looking, unique hash ID. The hash ID of some commit is how Git knows that that commit is that commit. Your Git will present this hash ID to some other Git software; if that other Git software, working with its repository, has a commit with this number, it has this commit. If it does not have this number, it needs to get this commit from your Git repository, if your Git is offering it. (The same goes in the other direction, when you have your Git add new commits to your repository, obtained from some other Git repository.)

  • is read-only: no commit can ever be changed, not even by Git itself.

  • stores two things: a full snapshot of every file that Git knew about at the time you, or whoever, made the commit; and some metadata, or information about the commit, such as who made it and when. Note that the files stored in a commit are kept in a special, read-only, Git-only format, compressed and de-duplicated. Your computer can't read these files (well, it can read the raw data, but it can't make sense of it) and nothing—not even Git itself—can overwrite these files (because of the same hashing scheme that's used for commits).
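To make these three properties a little more concrete, here is a minimal sketch that builds a throwaway repository and looks inside its one commit. The temporary directory, the placeholder user name and email, and the file contents are assumptions made for illustration only.

```shell
set -e
repo=$(mktemp -d)                          # throwaway repository
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master    # call the first branch "master" whatever the local default is
git config user.name "Example User"        # placeholder identity, local to this repository
git config user.email "user@example.com"

echo "text wite on master" > file.txt
git add file.txt
git commit -q -m "init"

commit_id=$(git rev-parse HEAD)                       # the commit's hash ID
obj_type=$(git cat-file -t "$commit_id")              # what kind of object that ID names
snapshot=$(git ls-tree -r --name-only "$commit_id")   # files held in the commit's snapshot

echo "hash ID:  $commit_id"
echo "type:     $obj_type"
echo "snapshot: $snapshot"
git cat-file -p "$commit_id"               # metadata: tree, author, committer, log message
```

The last command prints the commit's metadata; the `tree` line in it is how the commit refers to its snapshot.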

Because commits and their files can only be read by Git itself, Git has to extract a commit before you can do any work with it. This is what git checkout does: it extracts the commit.

When you switch from one branch to another—whether with git checkout or the newer git switch—you may be telling Git to switch from one commit to another commit. In this case, Git has to remove the files that came out of the commit you were using, and replace them with files that came out of the commit you will be using. Before Git does this, though, it checks to make sure that any files it removes-and-replaces aren't actually modified. That way you won't lose work you've done but have not yet committed.

If you have done work, and haven't committed it, the work you have done so far is not in Git. It is merely in the files Git extracted earlier, which you have changed since then. So switching from one branch to another does not, by itself, change or hide that work: the modified files are not on any branch, because they aren't in Git.

This whole system can be pretty confusing, so let's say a bit more.

Commits are numbered, and link to earlier commits

Whenever you, or anyone, make a new commit, the new commit gets a new, unique, random-looking hash ID. That hash ID is, and must be, different from the hash ID of every other commit everywhere in the universe.[1] That new commit gets written out, and from then on, it can never be changed.

The new commit, as it's being written out, can have the hash ID of some older commit stored inside it. This makes the new commit "point to" the older commit. Once we've repeated this trick a few times, we have a chain of commits. If we call them by single uppercase letters—this is easier for humans to understand—we get a drawing that looks like this:

A <-B <-C

where C is the latest commit. We say that commit B, which came just before C, is C's parent. Commit C points to its parent B. Commit C also holds a full snapshot of every file. Commit B, of course, is also a commit and holds a full snapshot of every file and points to its parent A. Commit A is a commit and holds a snapshot, but since commit A is the first commit, it can't point backwards to any earlier commit, so it just doesn't.

By starting with the latest commit and working backwards, Git can find all the commits. So we only need to remember the last commit's hash ID.
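The backwards-pointing chain can be checked directly. This is a rough sketch, assuming a throwaway repository with three commits whose log messages are A, B, and C; the directory, identity, and file contents are made up for the example.

```shell
set -e
repo=$(mktemp -d)                          # throwaway repository
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

echo "A" > file.txt;  git add file.txt; git commit -q -m "A"; a=$(git rev-parse HEAD)
echo "B" >> file.txt; git add file.txt; git commit -q -m "B"; b=$(git rev-parse HEAD)
echo "C" >> file.txt; git add file.txt; git commit -q -m "C"; c=$(git rev-parse HEAD)

# %P prints the full hash ID(s) of a commit's parent(s)
parent_of_c=$(git log -1 --format=%P "$c")
parent_of_b=$(git log -1 --format=%P "$b")
parent_of_a=$(git log -1 --format=%P "$a")   # empty: the first commit has no parent

# Starting from the last commit, Git walks the chain backwards
git log --format='%h  parent: %p  message: %s'
reachable=$(git rev-list --count HEAD)       # number of commits found from HEAD
```

Only the hash ID of C was needed to start the walk; B and A were found through the stored parent IDs.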


[1] This is technically impossible, and someday Git will fail to work. The large size of the hash IDs tries to put that day so far in the future that we don't care that it won't actually work forever: we'll all be long dead before there's a problem. At least, that's the idea, but we're already running into a few other minor issues, so Git is getting a new even-bigger hash ID scheme.


Branch names help us (and Git) find commits

The system above works fine as long as we remember the actual hash ID of commit C. But who can remember some big ugly hexadecimal number like that? I can't, and you probably can't. We could write these down, perhaps ... but hey, wait a minute, we have a computer. Let's have the computer store the number of the latest commit. We'll put it in a small database of names. Let's call them branch names and tag names and the like.

Now that we have names, we can add them to our drawing. Each name points to some commit:

A--B--C   <-- master

Here, the branch name master points to commit C. Let's add another branch name, seconde-branch, that also points to commit C, like this:

A--B--C   <-- master, seconde-branch

We now need a way to remember which name we are using. Let's use the special name HEAD for this:

A--B--C   <-- master (HEAD), seconde-branch

This indicates that we are using commit C as our current commit, via the name master. If we now:

git checkout seconde-branch

we get:

A--B--C   <-- master, seconde-branch (HEAD)

We're still using commit C, but now we're using it via the name seconde-branch.
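Here is a small sketch of that switch, assuming a throwaway repository with a single commit standing in for C (the directory, identity, and file contents are placeholders). It records which name HEAD is attached to, and which commit that name selects, before and after the checkout.

```shell
set -e
repo=$(mktemp -d)                          # throwaway repository
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

echo "text wite on master" > file.txt
git add file.txt
git commit -q -m "C"

git branch seconde-branch                  # a second name for the same commit

name_before=$(git symbolic-ref --short HEAD)   # branch name HEAD is attached to
commit_before=$(git rev-parse HEAD)            # commit that name selects

git checkout -q seconde-branch

name_after=$(git symbolic-ref --short HEAD)
commit_after=$(git rev-parse HEAD)

echo "before: $name_before -> $commit_before"
echo "after:  $name_after -> $commit_after"
```

The name changes; the hash ID does not.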

When we change branches like this, we're not changing which commit we're using. So Git does not have to remove-and-replace any files at all, and therefore, Git doesn't bother. This lets us switch to the other branch, in case we forgot and started editing files too soon.
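The situation in the question can be reproduced in a few lines. This is a sketch under the same assumptions as the question (one commit, two branch names), run in a throwaway directory with a placeholder identity.

```shell
set -e
repo=$(mktemp -d)                          # throwaway repository
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

echo "text wite on master" > file.txt
git add file.txt
git commit -q -m "init"

git checkout -q -b seconde-branch                  # new name, same commit
echo "text wite on seconde branch" >> file.txt     # edited, but not committed

git checkout -q master                     # same commit again, so file.txt is left alone

committed=$(git show HEAD:file.txt)        # what the commit that master names actually holds
status=$(git status --short)               # " M file.txt": modified in the working tree only

cat file.txt                               # both lines are still here
echo "committed copy: $committed"
```

The commit that master points to still holds only the first line. The second line lives in the working tree, which is why it appears to follow you from branch to branch.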

Git's index and your working tree

As I mentioned above, when we first check out or switch to some branch, Git will (if needed) extract all the files from the snapshot in the commit found by the branch name. Inside the commit, these files are in some weird Git-only format, compressed and de-duplicated, but once extracted they're regular everyday files.

These files go in a work area. Git calls this our working tree or work-tree. The files here came out of a commit but are not actually in Git: they're just ordinary files in ordinary folders. Git has no control over these files: you can do anything you want with them.

When you have done something with them, you'd typically like to save the things you did. For this purpose you'll need to make a new commit. In other version control systems, you'd run their commit verb (e.g., hg commit or svn commit) and they'd scan your working tree, find what you changed, and make the new commit. Git, however, is different. Git makes you run git add.

What git add does is copy the updated file back into a secret—well, not really secret, but invisible—Git area that, in effect, sits between a commit and your working tree. This area is extremely important in Git, at least if you ever plan to make any new commits. (If you don't need new commits, you can mostly ignore it.) Because it is so important, and/or because it is badly named, this area has three names: Git calls it the index, the staging area, and—rarely these days—the cache.

(You can—to a limited extent—get by with git commit -a instead of git add. Don't do this! You'll be able to ignore the index for a while, but eventually, Git will whack you over the head with its index. Learn about the index. Embrace it. Some people find it useful: there are clever tricks you can do with it. Some find it annoying, but it's there, in the way, and you need to know about it so you don't trip over it.)

Git's index is a complicated thing, but it plays one pretty constant role, and can therefore be described in one line this way: The index holds your proposed next commit. The initial git checkout or git switch that you run extracts the commit's files to Git's index.

The files in Git's index are in the compressed and de-duplicated form that Git uses internally. The key difference between these files, and the files in a commit, is that the commit cannot be changed, but the index contents can be changed. Running git add tells Git: Make the index copy of this file look like the work-tree copy.

What this means is that after git add, you've updated your proposed next commit. When you first check out commit C, or are on commit C with modified working tree files like this:

A--B--C   <-- master, seconde-branch (HEAD)

the index still holds the original files that were extracted from commit C. Until you run git commit—which will write out the index's files into a permanent form in a new commit—the index copies are just sitting around ready to go into a new commit.

Running git add updates the index copies, making them match the working tree copies. So this means that with, e.g., file.txt, there are three copies:

  HEAD         index      work-tree
---------    ---------    ---------
file.txt     file.txt     file.txt

As you modify the work-tree copy, nothing happens to the other two copies. If we put version numbers in the table above, we get:

   HEAD           index        work-tree
-----------    -----------    -----------
file.txt(1)    file.txt(1)    file.txt(2)

When you run git add file.txt, Git updates the index copy to match the work-tree copy:

   HEAD           index        work-tree
-----------    -----------    -----------
file.txt(1)    file.txt(2)    file.txt(2)

Note that you can change the work-tree copy again, without using git add, and at this point all three copies will differ.
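The three copies can be inspected one at a time. In this sketch the file contents "version 1", "version 2", and "version 3" are made up so that each copy is easy to tell apart, and the repository is a throwaway one with a placeholder identity.

```shell
set -e
repo=$(mktemp -d)                          # throwaway repository
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

echo "version 1" > file.txt
git add file.txt
git commit -q -m "init"                    # HEAD, index and work-tree all hold version 1

echo "version 2" > file.txt
git add file.txt                           # index now holds version 2

echo "version 3" > file.txt                # work-tree changed again, no git add

head_copy=$(git show HEAD:file.txt)        # copy in the current commit
index_copy=$(git show :file.txt)           # copy in the index (note the leading colon)
tree_copy=$(cat file.txt)                  # ordinary file in the working tree
status=$(git status --short)               # "MM": staged change plus a further unstaged change

echo "HEAD:      $head_copy"
echo "index:     $index_copy"
echo "work-tree: $tree_copy"
```

Running git commit at this point would store version 2, since that is what the index holds.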

(Note that if you run git add on a new file, that isn't in the index yet, Git will copy this new file into the index. This adds the new file to the proposed commit. It's not yet in any commit, but it's now ready to be committed. Or, you can run git rm on a file to remove it from both the index and your working tree. Now it's gone from the index, so it won't be in the next commit. This does not affect any existing commits: those cannot be changed.)
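A short sketch of adding and removing files from the proposed commit, assuming a throwaway repository and a hypothetical second file named new.txt:

```shell
set -e
repo=$(mktemp -d)                          # throwaway repository
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

echo "text wite on master" > file.txt
git add file.txt
git commit -q -m "init"

echo "something new" > new.txt             # hypothetical new file
git add new.txt                            # copied into the index
index_after_add=$(git ls-files)            # git ls-files lists what the index holds

git rm -q file.txt                         # removed from the index and the working tree
index_after_rm=$(git ls-files)

in_commit=$(git ls-tree -r --name-only HEAD)   # the existing commit is untouched

echo "index after add: $index_after_add"
echo "index after rm:  $index_after_rm"
echo "existing commit: $in_commit"
```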

When you run git commit, this is what happens:

  1. Git gathers any metadata it needs, such as your name and email address, and the current date-and-time. It may collect a log message from you, or use the -m argument to get the log message.
  2. Git puts the current commit's hash ID into the metadata for the new commit, as the new commit's parent.
  3. Git writes out whatever files are in the index.
  4. Git turns the above into a commit. This creates the commit's unique hash ID. (One reason for the date-and-time-stamp is that since this is always changing, the hash ID will differ from that of any other commit that is otherwise exactly the same.)
  5. This is the sneaky bit. Now that the new commit exists, Git writes the new commit's hash ID into the current branch name.

This means that if you currently have:

A--B--C   <-- master, seconde-branch (HEAD)

and you run git commit and it successfully makes a new commit, you now have:

A--B--C   <-- master
       \
        D   <-- seconde-branch (HEAD)

Note how master still points to commit C, but seconde-branch now points to new commit D. Commit D points back to existing commit C as its parent. No commits have changed, but there is now a new commit in the repository.
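The same picture can be checked with hash IDs instead of letters. This sketch assumes a throwaway repository in which a single commit plays the role of C; the identity and file contents are placeholders.

```shell
set -e
repo=$(mktemp -d)                          # throwaway repository
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

echo "text wite on master" > file.txt
git add file.txt
git commit -q -m "C"
c=$(git rev-parse HEAD)

git checkout -q -b seconde-branch          # both names select C; HEAD is attached to seconde-branch
echo "text wite on seconde branch" >> file.txt
git add file.txt
git commit -q -m "D"                       # new commit; only the current branch name moves
d=$(git rev-parse HEAD)

master_id=$(git rev-parse master)
second_id=$(git rev-parse seconde-branch)
parent_of_d=$(git log -1 --format=%P "$d")

echo "master         -> $master_id"
echo "seconde-branch -> $second_id"
echo "parent of D    =  $parent_of_d"
```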

If you now run:

git checkout master     # or git switch master

Git must now remove the commit-D files and replace them with the commit-C files. Git has to do this for all the files that are different. It can cheat a bit, and for any file that is the same in commits C and D, it can leave that file alone in the index and working tree.

(A degenerate case of "leave the file alone" occurs when switching from commit C to commit C: there's no change at all, so all files can be left alone. That's the case you're seeing in your example, and it's always true for the kind of git checkout -b you are using.)

But if some files are different, Git will have to remove-and-replace those. Here Git will first make sure that these files in your working tree aren't changed; if they are, Git will refuse to switch commits. You can force the switch anyway, telling Git to throw away your changes. (Since those changes are in your working tree, which is not in Git, Git will not be able to help you recover from this. So don't ignore Git's complaints about files that would be overwritten. Figure out why you haven't saved them!)
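To see the refusal happen, here is a sketch in which file.txt differs between the two commits and also has an uncommitted edit. The repository, identity, and file contents are assumptions made for the example.

```shell
set -e
repo=$(mktemp -d)                          # throwaway repository
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

echo "text wite on master" > file.txt
git add file.txt
git commit -q -m "C"

git checkout -q -b seconde-branch
echo "text wite on seconde branch" >> file.txt
git add file.txt
git commit -q -m "D"                       # file.txt now differs between C and D

echo "uncommitted edit" >> file.txt        # work that exists only in the working tree

# Switching would have to replace file.txt, so Git checks it first and refuses
msg=$(git checkout master 2>&1) && switched=yes || switched=no
current=$(git symbolic-ref --short HEAD)

echo "switched: $switched"
echo "still on: $current"
echo "$msg"
```

The uncommitted line is still in file.txt afterwards, and HEAD is still attached to seconde-branch.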

Let's switch to commit C again:

A--B--C   <-- master (HEAD)
       \
        D   <-- seconde-branch

We can now make a new commit on master, by changing some files, or adding new files, or removing files, or some combination of all three. We git add any updates if necessary and run git commit, and after it succeeds, we have a new commit, with a new big ugly hash ID, but we'll just call it E:

        E   <-- master (HEAD)
       /
A--B--C
       \
        D   <-- seconde-branch

Now, think about this: Which commits are on which branches? In particular, which branch(es) hold commits A-B-C? Remember that Git always starts with the last commits—of which there are now two, commits D and E—and works backwards.

Working backwards from E, we also traverse commits C, then B, then A. So these commits should all be on master. Working backwards from D, we also traverse C, then B, then A. So these commits should all be on seconde-branch.

So: which branch(es) are commits A-B-C on? I'll leave this as an exercise, but will note that Git's answer is very different from, say, Mercurial's.
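If you'd like to check your answer against Git's, one way is to ask which branch names can reach a given commit. This sketch rebuilds the graph above in a throwaway repository; the file other.txt used for commit E, the identity, and the file contents are made up for the example.

```shell
set -e
repo=$(mktemp -d)                          # throwaway repository
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

for name in A B C; do                      # the shared chain A--B--C
  echo "$name" >> file.txt
  git add file.txt
  git commit -q -m "$name"
done
c=$(git rev-parse HEAD)

git checkout -q -b seconde-branch
echo "D" >> file.txt
git add file.txt
git commit -q -m "D"
d=$(git rev-parse HEAD)

git checkout -q master
echo "E" > other.txt                       # hypothetical second file for commit E
git add other.txt
git commit -q -m "E"
e=$(git rev-parse HEAD)

# Branch names from which each commit can be reached by walking backwards
on_c=$(git branch --format='%(refname:short)' --contains "$c")
on_d=$(git branch --format='%(refname:short)' --contains "$d")
on_e=$(git branch --format='%(refname:short)' --contains "$e")

echo "C is on:"; echo "$on_c"
echo "D is on:"; echo "$on_d"
echo "E is on:"; echo "$on_e"
```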

CodePudding user response:

Uncommitted changes are not bound to any branch.

They are just local changes in your local file system.

When you switch between branches, Git will usually preserve these local changes (unless you discard them by doing a force-checkout).

If you want to store changes for later, but without committing them to any branch, use git stash push -u (spelled git stash save -u in older Git versions; save is now deprecated) to save your local changes, including untracked files, to your local Git stash.

You can then restore them later using git stash apply.
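A minimal sketch of that workflow, applied to the example from the question. It assumes a throwaway repository, a placeholder identity, and a Git version recent enough to have git stash push.

```shell
set -e
repo=$(mktemp -d)                          # throwaway repository
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

echo "text wite on master" > file.txt
git add file.txt
git commit -q -m "init"

git checkout -q -b seconde-branch
echo "text wite on seconde branch" >> file.txt   # work in progress, not committed

git stash push -q -u                       # set the work aside; file.txt goes back to the committed version
git checkout -q master
on_master=$(cat file.txt)                  # only the committed line is visible here

git checkout -q seconde-branch
git stash apply -q                         # bring the work back into the working tree
restored_lines=$(grep -c "" file.txt)      # count the lines in file.txt
stash_entries=$(git stash list | grep -c "")

echo "on master: $on_master"
cat file.txt
```

Note that git stash apply leaves the entry in the stash list; git stash drop removes it once you no longer need it.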
