git reset --hard does not seem to be resetting uncommitted changes

Time:11-15

I am new to Git and I have made some unwanted changes like this:

[screenshot: git status output listing the unwanted files]

Now I don't need any of these, so I tried this:

git reset --hard

But it still shows them!

So how can I reset this and get back to where I haven't created any of these unwanted files?

CodePudding user response:

An untracked file:

  • did not come out of the current commit (see possible exception below, in the long description);
  • is not in the proposed next commit; and
  • is lying around in your working tree.

As such, the file is not in Git now, it will not be in Git tomorrow, and so no Git operation will touch it.

If you don't want these files to be in the next commit, there is nothing else you need to do.

Long: What's going on here

Git isn't about files. Git is about commits.

Those new to Git often think it's about files, but it's not. Commits do contain files, but Git is about the commits. They might also think that Git is about branches, but that's not really true either: branch names do help you (and Git) find commits, but Git is still all about the commits.

Since Git is all about commits, you are required to know about them. Here's what you need to know right now:

  • Each commit has a number. These aren't simple counting numbers: we don't have commit #1, followed by #2, then #3, and so on. Instead, each commit gets some really huge, big and ugly number, seemingly random, like 5a73c6bdc717127c2da99f57bc630c4efd8aed02 for instance.

    This number is unique to this one particular commit. Every commit everywhere, no matter who makes it, when, where, etc., has to get its own unique number. This is why the number has to be so huge. The number is actually the output of a cryptographic hash function.

  • Each commit stores two things:

    • Every commit stores a full snapshot of every file. This is the main data part of a commit. The files inside the commit are in a special, compressed, Git-only and de-duplicated form: they're not ordinary files at all (and Git can store files that some systems such as Windows can't extract, which creates problems for Windows users). This acts as a permanent archive, like a tar or zip file of all the files in the commit.
    • And, every commit stores some metadata: some information about the commit itself, such as who made it and when. This metadata lets Git tie newer commits back to older ones, which is how Git gets a lot of its work done.
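
If you want to see the number, the snapshot, and the metadata for yourself, here is a minimal sketch. It assumes a throwaway repository made with mktemp; the file name, the commit message, and the Demo identity are made up for illustration.

```shell
# Throwaway repository so nothing real is touched.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"

echo "hello" > README.md
git add README.md
git commit -q -m "initial"

# The commit's big ugly number (its hash ID).
commit_id=$(git rev-parse HEAD)
echo "$commit_id"

# What kind of object that number names.
object_type=$(git cat-file -t "$commit_id")
echo "$object_type"

# The raw commit: a "tree" line naming the snapshot, then the
# author/committer metadata, then the log message.
git cat-file -p "$commit_id"
```

The tree line is the pointer to the full snapshot; everything else in the output is metadata.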

Due to the cryptographic numbering scheme, no part of any commit can ever be changed. Once you've made a commit, it's frozen that way forever. (If you make a bad commit by mistake, you can just stop using it. It sticks around for a long time, in case you want it back, but eventually Git will figure out that not only aren't you using it, but—under the right conditions—nobody else ever will be able to find and use it either and therefore Git can remove it entirely. But that's a trickier matter that we won't worry about here.)

But if a commit is read-only (and it is), and the files inside a commit are stored in a format that only Git itself can read (and they are) and literally nothing can write, how will we ever get any actual work done? Git has the same answer here as all version control systems, which all share this kind of problem. You don't work on the files that are in Git. You work, instead, on copies that Git takes out of Git.

Git extracts these usable, workable copies of your files on demand, when you check out a commit with git checkout or git switch. The usable files go into a work area, which Git calls the working tree or work-tree. This is pretty simple and straightforward: your working tree is where you get your work done. It has files you can see and use. But these files are not in Git.

Making new commits

Other (non-Git) version control systems start out this same way: you extract a commit, and that gets you useful files. Then you edit the files as needed and when you're ready, you run, e.g., hg commit (for Mercurial, a different version control system). This non-Git VCS figures out what you did to the files and makes a new commit and you're all set.

Git makes things much harder. Instead of reading your working tree when you run git commit, Git sets up a separate thing. This thing has three names, perhaps because the first names were terrible ones. The three names are:

  • the index: this name doesn't mean anything, which has some good and bad aspects; I tend to use this one myself;
  • the staging area: this name reflects how you use the thing, and is perhaps the best name; and
  • the cache: this name is not so good, and Git mostly avoids it now, but it lingers in flags like git rm --cached.

This thing—the index or staging area—holds your proposed next commit. When you first check out some commit, Git fills it in with copies (or "copies", because they're de-duplicated already: the index "copy" is in Git's internal format) of all the files that are in the commit you just checked out. These files also go into your working tree (as real, ordinary files, rather than weird Git-ized de-duplicated magic).

When you modify the working tree copy of some file, that just changes the working tree copy of that file. No Git file has changed anywhere. The proposed next commit still holds the previous version of the file, the one Git extracted from the current commit.

If you want Git to commit the updated copy, not the old copy you took out of the earlier commit, you must tell Git to update its index / staging-area. You do this with git add, e.g., git add file.ext. This tells Git to read the working tree version of file.ext, compress it into Git's internal format, arrange for the de-duplication as appropriate, and get it all ready for the next commit. This next-commit-ready copy of the file goes into the index / staging-area and now you've updated the proposed next commit.

What all this means is that there are, at all times, three copies of every file (although some of them are Git-ized and hence de-duplicated):

  HEAD         index      work-tree
---------    ---------    ---------
README.md    README.md    README.md
main.py      main.py      main.py

for instance, if you have two files in the current commit. The HEAD copies are read-only (and de-duplicated); the index copies are replaceable (but also de-duplicated); and the work-tree copies are usable, ordinary files that aren't de-duplicated but let you get work done.

Running git commit makes a new commit from the copies that are in the index. So that's why these copies exist: to be ready for the next commit.
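
Here is a small sketch that makes all three copies differ and then reads each one back. It assumes a throwaway repository; the file contents v1, v2 and v3 and the Demo identity are just placeholders.

```shell
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"

echo "v1" > README.md
git add README.md
git commit -q -m "initial"      # HEAD copy is v1

echo "v2" > README.md
git add README.md               # index copy is now v2

echo "v3" > README.md           # only the working tree copy is v3

head_copy=$(git show HEAD:README.md)    # copy in the current commit
index_copy=$(git show :README.md)       # copy in the index
worktree_copy=$(cat README.md)          # ordinary file on disk

echo "HEAD=$head_copy index=$index_copy work-tree=$worktree_copy"
# prints: HEAD=v1 index=v2 work-tree=v3
```

Running git commit at this point would store v2, since that is what the index holds.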

Untracked files

Your working tree is just an ordinary directory (or folder, if you prefer that term) on your computer. Because it is an ordinary folder, you can create new files here, or remove existing ones. Git won't know or care that you did so: the proposed next commit is hidden away in Git's index (or staging area). It doesn't have the new files:

  HEAD         index      work-tree
---------    ---------    ---------
README.md    README.md    README.md
main.py      main.py      main.py
                          new.txt

Any newly-created files that are in your working tree right now, but aren't in Git's index right now, are untracked files. That's how untracked file is defined, in Git: an untracked file is a file that does exist in your working tree, but doesn't exist in Git's index.
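
You can ask Git directly which files fit that definition. A minimal sketch, assuming a throwaway repository with the same made-up file names as the table above:

```shell
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"

echo "readme" > README.md
echo "print('hi')" > main.py
git add README.md main.py
git commit -q -m "initial"

# Create a file in the working tree without telling Git about it.
echo "scratch" > new.txt

# Files that are in the index (tracked).
tracked=$(git ls-files)
echo "$tracked"

# Files in the working tree that are not in the index (untracked),
# leaving out anything covered by the usual ignore rules.
untracked=$(git ls-files --others --exclude-standard)
echo "$untracked"               # prints: new.txt
```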

If you run git add on one of these untracked files, Git will read it, compress it into the internal format, check for duplicates, and so on, and add the new file to Git's index, making it staged for commit. Now the file does exist in Git's index (and also in your working tree) and so it's no longer an untracked file. By changing the set of files in Git's index, you've changed which files are untracked:

  HEAD         index      work-tree
---------    ---------    ---------
README.md    README.md    README.md
main.py      main.py      main.py
             new.txt      new.txt

Similarly, you can use git rm --cached to remove a file from Git's index, but leave it in your working tree. This causes a file that was tracked to become untracked. (That's the special exception I mentioned earlier: if a file was in the commit you checked out, but then you removed the index copy, it's now untracked.)
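
A quick sketch of that exception, again in a throwaway repository with made-up names: the file comes out of the commit, loses its index copy, and turns into an untracked file without ever leaving the working tree.

```shell
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"

echo "print('hi')" > main.py
git add main.py
git commit -q -m "initial"

# Remove only the index copy; the working tree file is left alone.
git rm --cached -q main.py

ls main.py                      # the file is still on disk

# main.py is now in the working tree but not in the index: untracked.
untracked=$(git ls-files --others --exclude-standard)
echo "$untracked"               # prints: main.py

# git status now reports both a staged deletion and an untracked file.
git status --short
```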

Again, all this updating happens in Git's index, which you can't see. There's no obvious place to look for the files in Git's index.[1] If Git were totally silent about this, though, it would be even harder to use than it already is. So git status will report untracked files.

Running git status actually does a bunch of things:

  • First, it has Git print out your current branch name, and some other useful information.

  • Next, it has Git compare the current commit, i.e., HEAD, to the index. For all the files that match, git status says nothing at all. For any file that's different, git status says that this file is staged for commit. That means the index copy of the file is different.

  • Then git status has Git compare the index files to the working tree files. For all the files that match, git status says nothing at all. For any file that's different, git status says that this file is not staged for commit. That means the working tree copy of the file is different.[2]
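
The two comparisons can be run by hand with git diff, which is one way to convince yourself of what git status is summarizing. This is only a sketch in a throwaway repository with made-up file contents; it is not a claim about how git status is implemented internally.

```shell
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"

echo "readme" > README.md
echo "print('hi')" > main.py
git add README.md main.py
git commit -q -m "initial"

echo "staged change" >> README.md
git add README.md               # index copy of README.md now differs from HEAD

echo "unstaged change" >> main.py   # working tree copy differs from index

# First comparison: HEAD vs index ("staged for commit").
staged=$(git diff --cached --name-only)
echo "staged: $staged"          # prints: staged: README.md

# Second comparison: index vs working tree ("not staged for commit").
unstaged=$(git diff --name-only)
echo "not staged: $unstaged"    # prints: not staged: main.py
```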

During these two comparisons, git status says nothing about untracked files. It does collect their names, though: the second comparison turns up every file that is in the working tree but not in the index, so Git now has a full list of the untracked files.

Finally, git status shows those untracked files, and this is where the .gitignore file kicks in, if you have one.


[1] You can run git ls-files --stage to dump out what's in Git's index. This isn't meant for everyday work though, and it's not a good way to get stuff done.

[2] Note that all three copies can be different. For instance, check out some existing commit that has a README.md. Add a line to the file and run git add README.md. Then add a second line to the file. Now all three copies are different. Try git status. The file is both "staged for commit" and "not staged for commit". That just means HEAD-vs-index shows the first added line, and index-vs-working-tree shows the second added line.

If you run git commit now, Git commits the index copy of each file. So the README.md file in the new commit has just the one added line. If you run git add README.md now instead, Git replaces the index copy, with the one added line, with a new index copy with both added lines.
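
The footnote's experiment, written out as a sketch you can paste into a shell. It uses a throwaway repository and made-up line contents, and takes the "run git commit now" branch to show that only the index copy gets committed.

```shell
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"

echo "original line" > README.md
git add README.md
git commit -q -m "initial"

echo "first added line" >> README.md
git add README.md                       # index copy: 2 lines
echo "second added line" >> README.md   # working tree copy: 3 lines

# Short status: first column is HEAD-vs-index, second is index-vs-work-tree.
status_before=$(git status --short)
echo "$status_before"                   # prints: MM README.md

# Commit now: Git stores the index copy, not the working tree copy.
git commit -q -m "add first line"

committed_lines=$(git show HEAD:README.md | wc -l)
worktree_lines=$(wc -l < README.md)
echo "committed: $committed_lines lines, work-tree: $worktree_lines lines"
```

The new commit has the original line plus the first added line; the second added line is still sitting only in the working tree.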


Untracked files can be "ignored"

If some file is untracked, git status would normally list it in the untracked-files section of its output, the section you don't like that shows files such as app/Popup.php. The point of listing the file there is to alert you that you have not yet added it, and that it won't be in the next commit until you do.

But what if that file is not supposed to be in the next commit, or even in any commit? Well, one answer to that problem is that you can remove the file right now:

rm app/Popup.php

Now it's no longer in your working tree. (Note that this removal was not done by Git, or even for Git, it's just something you did on your own.) Since this file was never in Git, it's now gone for good—at least as far as Git goes. Git can't help you get it back. It was never in Git!

But: maybe you don't want to remove it. Maybe you'd like to tell Git two things at the same time:

  • Hey Git, stop complaining about app/Popup.php!
  • Hey Git, don't go adding app/Popup.php to your index! It shouldn't be committed!

To do both of these things, you can list the file in a .gitignore file:

echo app/Popup.php >> .gitignore

or:

echo /Popup.php >> app/.gitignore

(these will have the same effect here).

Listing a file, or a glob pattern like *.o or *.pyc, in a .gitignore file, tells Git: stop complaining about this file / files that match this pattern when they are untracked. So that makes git status more useful: it only warns you about files that should be tracked now.

It also stops git add from adding the file. You can force git add to add the file (with git add -f), but by default, git add won't add the file to Git's index now.

None of this has any effect if the file is already in Git's index. So .gitignore is not really a list of files to ignore. It's really .git-don't-complain-about-these-files-if-they-are-untracked-and-do-not-add-them-to-the-staging-area-unless-I-force-it-because-they-are-supposed-to-stay-untracked-okay?, or something like that. But having that as the file name would be crazy, so Git just calls it .gitignore.
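
Here is a sketch of both effects, plus the forced add. It assumes a throwaway repository with no commits yet; app/Popup.php is recreated as a dummy file purely for the demonstration.

```shell
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q

mkdir app
echo "<?php" > app/Popup.php
echo "app/Popup.php" >> .gitignore

# Effect 1: git status stops listing the file. Only .gitignore itself
# shows up as untracked.
status_out=$(git status --short)
echo "$status_out"              # prints: ?? .gitignore

# git check-ignore exits with status 0 when a path is ignored.
git check-ignore -q app/Popup.php && echo "app/Popup.php is ignored"

# Effect 2: a plain git add refuses to add the ignored file.
if git add app/Popup.php 2>/dev/null; then
    plain_add="added"
else
    plain_add="refused"
fi
echo "plain git add: $plain_add"

# Forcing it puts the file in the index anyway. From then on the
# .gitignore entry has no effect on it, because it is tracked.
git add -f app/Popup.php
git ls-files
```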

The bottom line

Untracked files won't be in your next commit. Asking "what do I do about these files that I don't want in my next commit" is pointless: they already won't be in your next commit. Asking how to make git status more useful, by having it not list the files, is the better question. The answer is pretty straightforward: you just need to get used to Git's weirdness about having three copies of each file at all times, and to the fact that the name .gitignore is misleading. The listed files aren't ignored at all; they're just silently left untracked as long as they are actually untracked (which is hard to tell without using the debug-like git ls-files).

CodePudding user response:

As git status tells you, those files are untracked. That means they are outside of Git's purview. Therefore git reset --hard does nothing to them.
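
To see this for yourself, here is a minimal sketch in a throwaway repository (file names and the Demo identity are made up): the edit to a tracked file is thrown away, while the untracked file survives.

```shell
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"

echo "readme" > README.md
git add README.md
git commit -q -m "initial"

echo "unwanted edit" >> README.md   # change to a tracked file
echo "scratch" > new.txt            # brand-new untracked file

git reset -q --hard

readme_after=$(cat README.md)
echo "$readme_after"                # prints: readme
ls new.txt                          # still there: reset did not touch it
```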

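The command for telling Git to get rid of untracked files is git clean. It comes with various options; for example, it won't recurse into untracked folders without being told to do so, so perhaps you want git clean -d. (With the default configuration, git clean also refuses to delete anything unless you pass -f, so the actual deletion is git clean -d -f.)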

(This is a dangerous command: git clean deletes files that Git has no copy of, so there is no getting them back, and running it in the wrong place can wipe out a lot of work. I strongly recommend you say git clean -d -n first, to do a dry run and see what would actually be removed.)
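
A sketch of the dry run followed by the real thing, done in a throwaway repository so that nothing of value is at risk. The file and folder names are made up; the -f flag is there because Git refuses to clean without it under the default configuration.

```shell
repo=$(mktemp -d)
cd "$repo" || exit 1            # never run git clean in the wrong directory
git init -q
git config user.name "Demo"
git config user.email "demo@example.com"

echo "readme" > README.md
git add README.md
git commit -q -m "initial"

# Some unwanted untracked material: a file and a folder.
echo "junk" > new.txt
mkdir scratch
echo "junk" > scratch/tmp.txt

# Dry run: report what would go, delete nothing.
dry_run=$(git clean -d -n)
echo "$dry_run"
# prints:
# Would remove new.txt
# Would remove scratch/

# Real run, once the list looks right.
git clean -d -f -q

ls                              # only README.md is left
```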
