I have the following folder structure:
Project/
    .git/
    .gitignore #1
    a/
        a1/
            a2.txt
            a3.txt
        .gitignore #2
    b/
        b1.txt
    c.txt
I would like Git to not ignore a2.txt, and to not ignore the entirety of b/. Everything else should be ignored.
Based on suggestions/comments/answers provided here, the content of .gitignore #1 is:
a/
c.txt
This ignores c.txt and everything in folder a/, unless a more deeply nested .gitignore overrides the latter.
The content of .gitignore #2 is:
!a1/a2.txt
I was hoping this more deeply nested .gitignore file would stop a2.txt from being ignored.
However, running git status --ignored results in:
On branch master

No commits yet

Untracked files:
  (use "git add <file>..." to include in what will be committed)
        .gitignore
        b/

Ignored files:
  (use "git add -f <file>..." to include in what will be committed)
        a/
        c.txt

nothing added to commit but untracked files present (use "git add" to track)
That is, the entirety of a/ seems to be ignored despite the exception I was hoping would be provided by .gitignore #2.
How can nested .gitignore files be used correctly to achieve the requirement above?
(Note: I have labelled the .gitignore files #1 and #2 in the description above only to tell them apart. On my actual computer, both files are named just .gitignore.)
CodePudding user response:
The general rule here is this:
- Git will use the OS's facilities to read directories.
- To scan a directory, Git calls opendir and the associated readdir (and eventually closedir) functions. readdir then returns directory entries, one at a time. Each entry holds a name component as defined below. Entries may also hold additional information, in particular the directory vs file distinction, but that's as much as Git can really count on here. If the OS fills in a d_type field with DT_DIR, DT_REG, etc., Git will try to use that; otherwise Git may have to fall back to calling lstat (which is expensive).
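As a rough analogy for one directory-level scan (in Python, not Git's own C code), the sketch below lists the entries of a single directory together with a directory/non-directory flag. The helper names list_entries and demo are made up for this example. Python's os.scandir can usually answer is_dir from the type information returned with the entry when the OS supplies it, and otherwise falls back to a stat-style call, which mirrors the d_type vs lstat trade-off described above.

```python
import os
import tempfile


def list_entries(directory):
    """Return sorted (name, is_directory) pairs for one directory level."""
    entries = []
    with os.scandir(directory) as it:
        for entry in it:
            # follow_symlinks=False: a symlink counts as a non-directory,
            # which is what an lstat-based check would report.
            entries.append((entry.name, entry.is_dir(follow_symlinks=False)))
    return sorted(entries)


def demo():
    """Build a throwaway directory holding one file and one subdirectory."""
    with tempfile.TemporaryDirectory() as top:
        os.mkdir(os.path.join(top, "path"))
        with open(os.path.join(top, "c.txt"), "w"):
            pass
        return list_entries(top)
```

Only the names at this one level come back; nothing below path/ is visible until that directory is opened in turn.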
Having read the entire directory, Git now has a set of name components. A name component is basically the part of a path-name that goes between slashes: for instance, with path/to/file.ext
we have three components, path
, to
, and file.ext
. Note that the same is true for /path/to/file.ext
: the leading slash just means "from the top" rather than "from wherever we are in the tree". Git makes some (rather peculiar) use of this same idea—that paths starting with a slash are "root relative" and the rest are "current position relative"—when using "anchored" entries in .gitignore
files (see below). So if path/to/file
exists in the top level of a working tree, Git will see only the path
part when it scans the top level directory.
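To make the notion of a name component concrete, here is a minimal Python sketch. The function names are hypothetical and it assumes forward slashes as separators.

```python
def name_components(path):
    """Split a slash-separated path into its name components."""
    # Empty strings come from a leading, trailing, or doubled slash; drop them.
    return [part for part in path.split("/") if part]


def is_root_relative(path):
    """A leading slash means "from the top" rather than "from here"."""
    return path.startswith("/")
```

Both path/to/file.ext and /path/to/file.ext yield the same three components; only is_root_relative tells them apart.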
(Side note: POSIX also includes scandir
, but people find this interface hard to use correctly. It's also "more efficient" in various senses on some systems, although not always or very predictably, to use the lower level readdir
routines, and Git uses readdir
.)
Now that Git has the name components, Git can check them against this particular level's .gitignore
, if it exists. It can also combine each component with any leading path name that got Git here in the first place. For the initial scan there is no such leading component, and no combining happens, but let's observe below what happens if we are allowed to proceed into path/
(which is a directory).
The components may now need a type check: file vs directory. Real-world file systems may have additional types, including symbolic links, but for our purposes here a symbolic link is treated like a file for the moment. We just want to know whether the component represents a directory.
Now, entries in any .gitignore
file that we have read so far—including the one in this directory that we're reading now—are flagged in three independent ways:
- Some are anchored, as in /path/to or a/b for instance, and some are not, as in *.o for instance. An anchored entry is one containing any slashes after removing a single trailing slash if it exists.
- Some are for directories only and some are for all names. An entry is flagged as directory-only if it ends in a trailing slash. (Since the trailing slash is meant as the "directory only" flag, it has to be ignored while deciding whether to set the "anchored" flag.)
- Some are positive ("do ignore") entries, and some are negative ("do not ignore") entries. A negative entry is one that starts with ! as the first character. (An anchored negative entry for /path would have to read !/path; /!path does not work here.)
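The three flags can be sketched as a small parser. This is an illustration of the rules just listed, not Git's actual parsing code: IgnoreEntry and parse_entry are invented names, and comments, backslash escapes, and ** handling are deliberately left out.

```python
from dataclasses import dataclass


@dataclass
class IgnoreEntry:
    pattern: str    # the entry with its "!", trailing "/" and leading "/" removed
    anchored: bool  # contains a slash once the trailing slash is removed
    dir_only: bool  # ended with a trailing slash
    negative: bool  # started with "!"


def parse_entry(line):
    """Classify one .gitignore line according to the three flags."""
    negative = line.startswith("!")
    if negative:
        line = line[1:]
    dir_only = line.endswith("/")
    if dir_only:
        line = line[:-1]
    # The trailing slash is already gone, so it cannot set the anchored flag.
    anchored = "/" in line
    pattern = line[1:] if line.startswith("/") else line
    return IgnoreEntry(pattern, anchored, dir_only, negative)
```

Under these rules, a/ from the question's .gitignore #1 is a non-anchored, directory-only, positive entry, while !a1/a2.txt from .gitignore #2 is an anchored negative entry.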
So let's imagine that we're reading the top level, or that we're reading directory path
within the top level. Let's suppose we encounter two name components at this level: path
, and to
. We now check all of these things more or less at the same time (in order, so that "last entry" overrides):
- Check the directory entry itself against all non-anchored ignore expressions. Is path a match for any of those? If so, this name is ignored/unignored as per the positive/negative flag.
- Check the full path so far against all anchored ignore expressions. For path this is /path, /path/path, or /to/path; for to, this is one of /to, or /path/to, or /to/to. (Remember that we found both /path and /to and presumably we're looking inside both.) If this path-so-far is a match against one of the anchored expressions, this name is ignored/unignored as per the positive/negative flag.
Note that when we do check an anchored path, we're looking at the full path in the working tree, while the .gitignore itself might be from a sub-path within that tree. So if we're reading directory /path
for instance and we have /path/.gitignore
and it has an anchored entry reading /xyzzy
, we're really checking this /xyzzy
against /path/xyzzy
(because it's from /path/.gitignore
, not from /.gitignore
). This is a little complicated, but makes sense once you think about it: the anchor is relative to the .gitignore
's location. This lets you rename directories without having to edit all anchored paths in any sub-.gitignore
files.
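A small sketch of how an anchored entry could be resolved against the location of its .gitignore, using the /path and /xyzzy names from the paragraph above. The function resolve_anchored is hypothetical and only shows the path arithmetic.

```python
import posixpath


def resolve_anchored(gitignore_dir, pattern):
    """Return the working-tree path that an anchored pattern refers to.

    gitignore_dir is the directory holding the .gitignore, relative to the
    top of the working tree ("" for the top level itself).
    """
    return posixpath.join("/", gitignore_dir.strip("/"), pattern.lstrip("/"))
```

With this, /xyzzy read from /path/.gitignore resolves to /path/xyzzy, while the same entry in the top-level .gitignore resolves to /xyzzy. Likewise, a1/a2.txt in the question's a/.gitignore refers to /a/a1/a2.txt.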
Note further that the "is a match" test may require that the directory entry itself name a directory. This is the case if the ignore entry is flagged as directories-only. So to check for that, we need to know if the entry—path
or to
for instance—names a directory in the OS's file system.
At this point, we have done all the checks we must do on this entry. It either matched some .gitignore entry or entries, in which case the last matching entry is the one taken, or it did not. Subdirectory .gitignore files are matched later in the chain, so the deepest .gitignore that could match this entry will always have the last match, if it has a match.
If this entry did not match any .gitignore
rule, this particular name is not ignored. If it did match a .gitignore
rule, the last one's positive/negative flag determines whether this particular name is ignored or not.
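The "last match wins" rule reduces to a few lines. In this sketch (is_ignored is an invented name) the input is the list of negative flags of every entry that matched, in the order the entries were read, with deeper .gitignore files coming later.

```python
def is_ignored(matching_negative_flags):
    """Decide the fate of one name from the entries that matched it.

    Each element is True for a negative ("!") entry and False for a
    positive one, in the order the entries were read.
    """
    if not matching_negative_flags:
        return False  # no entry matched: the name is not ignored
    # Only the last matching entry counts.
    return not matching_negative_flags[-1]
```

So a positive match followed by a negative one leaves the name un-ignored, and the reverse order leaves it ignored.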
Now that we know if the name is ignored, we have two options, each of which has two sub-options:
It is ignored:
- If it's a directory, we simply don't scan it at all.
- If it's a file, we don't auto-add the file (with git add . for instance), or for git status, we don't complain about the untracked file (assuming it is in fact untracked).
It is not ignored:
- If it's a directory, we scan it recursively and apply all these rules.
- If it's a file, we git add it (for git add . for instance) or make sure to complain if it's untracked (git status).
This determines whether git status
complains about it being untracked (for git status
commands) or whether git add
of some sort of recursive flavor (git add --all
, git add .
, git add somedir
, etc.) adds it.
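Putting the pieces together, here is a simplified simulation of the scan applied to the layout from the question. It is a sketch under loud assumptions, not Git's implementation: the working tree is an in-memory dict (a dict is a directory, a string is file content), parse and scan are invented names, and matching uses Python's fnmatch, whose * also crosses slashes, unlike Git's. The patterns used here contain no wildcards, so that difference does not matter for this case.

```python
import fnmatch


def parse(line, base):
    """Return (pattern, anchored, dir_only, negative) for one entry."""
    negative = line.startswith("!")
    if negative:
        line = line[1:]
    dir_only = line.endswith("/")
    if dir_only:
        line = line[:-1]
    anchored = "/" in line
    pattern = line.lstrip("/")
    if anchored and base:
        # Anchored entries are relative to the .gitignore's own directory.
        pattern = base + "/" + pattern
    return pattern, anchored, dir_only, negative


def scan(tree, base="", inherited=()):
    """Return (ignored, untracked) path lists for a nested-dict tree."""
    rules = list(inherited)
    for line in tree.get(".gitignore", "").splitlines():
        if line.strip():
            rules.append(parse(line.strip(), base))
    ignored, untracked = [], []
    for name, node in sorted(tree.items()):
        is_dir = isinstance(node, dict)
        rel = base + "/" + name if base else name
        verdict = False
        for pattern, anchored, dir_only, negative in rules:
            if dir_only and not is_dir:
                continue
            subject = rel if anchored else name
            if fnmatch.fnmatchcase(subject, pattern):
                verdict = not negative  # a later match overrides an earlier one
        if verdict:
            # An ignored directory is never opened, so any .gitignore
            # inside it is never read.
            ignored.append(rel + "/" if is_dir else rel)
        elif is_dir:
            sub_ignored, sub_untracked = scan(node, rel, rules)
            ignored += sub_ignored
            untracked += sub_untracked
        else:
            untracked.append(rel)
    return ignored, untracked


PROJECT = {
    ".gitignore": "a/\nc.txt\n",
    "a": {".gitignore": "!a1/a2.txt\n", "a1": {"a2.txt": "", "a3.txt": ""}},
    "b": {"b1.txt": ""},
    "c.txt": "",
}
```

Calling scan(PROJECT) reports a/ and c.txt as ignored, and .gitignore and b/b1.txt as untracked. The entry !a1/a2.txt never takes part, because a/ is pruned before a/.gitignore is read, which is the behaviour seen in the question. (Real git status collapses the wholly untracked b/ into a single line; this sketch lists the file inside it instead.)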
Note that you can override ignore entries with git add --force
, e.g., git add --force ignored-file
adds it even if ignored-file
would be ignored by the normal .gitignore
rules. I have never tried git add --force .
to see what happens here, but it's probably not good.