I understand you can use the NOT (!) to allow an exception. But based on this: "It is not possible to re-include a file if a parent directory of that file is excluded."
Is there any way around this? For my case specifically, I have a folder called model_results, with many subfolders. I want to ignore every single subfolder in model_results, EXCEPT that, inside the subfolders whose names end in _final, I want to keep one specific subfolder called model.
This is what I've tried, to no avail:
# ignore subdirectories within model_results
model_results/**
# un-ignore these
!model_results/*_final/model
This still ignored regtree_final/model
If it is possible, I suspect the reason for mine failing is the way I'm using wildcards.
CodePudding user response:
Recursion: see recursion
It's important to understand, here, that Git uses a recursive search to find files. What does this mean? Well, the joke version is the section title here, but in fact, we start with an initial directory (or folder, if you prefer that term) to search:
func search(prefix string, d Directory) {
    for element in (all files and subdirectories in d) {
        skip = false
        name = element.name
        full_name = prefix + name
        type = element.type // file, directory, or "other"
        switch type {
        case type_file:
            // files already in the index are never treated as ignored
            if not_in_index(full_name) and is_ignored(name, full_name, type_file)
                skip = true
        case type_directory:
            if is_ignored(name, full_name, type_directory)
                skip = true
        case type_other:
            skip = true
        }
        if skip {
            // don't even look at this any more
            continue
        }
        if type == type_file {
            git_add(full_name)
        } else {
            // recurse: the directory's full name becomes the new prefix
            subdir = opendir(full_name)
            search(full_name + "/", subdir)
            closedir(subdir)
        }
    }
}
That is the idea, except that the actual code is tremendously more complicated for various reasons. But the key is this: if we hit an ignored directory, we never look inside that directory! So if we ignore model_results/regtree_final (the directory), we never see any of the files inside model_results/regtree_final and therefore never add any of them. We never test to see whether they're ignored, or un-ignored, or whatever. We just never bother with the entire directory.
To make sure that we do look inside model_results/regtree_final, we must arrange for is_ignored(name, full_name, type_directory) to say "no, this is not ignored". So how do we do that?
Well, we can explicitly un-ignore the name regtree_final, or the full name model_results/regtree_final. That would require a line of the form:
!regtree_final
or:
!model_results/regtree_final
as a separate line in the .gitignore file that occurs after the model_results/** line.
Side notes
Side note: if we do ignore, say, model_results/blah, we don't need to carefully also ignore model_results/blah/zonk, because we'll never look at model_results/blah in the first place and hence never test model_results/blah/zonk. So as written, the ** is overkill. It's not wrong, it's just unnecessarily inclusive. Whether it will become right or necessary later is another question that you'll have to ask yourself later.
Secondary side note: I prefer to use the simple names in a .gitignore, rather than names containing embedded or leading slashes. That is, instead of a top-level .gitignore, I'd rather have a file named model_results/.gitignore in which I list the things to ignore that live within model_results. This is a personal preference item and you may or may not be using some third-party software that prohibits this anyway, so it's up to you whether to adopt a similar preference. Just remember that there's an important difference between anchored and un-anchored entries in .gitignore files, and when you use a top-level .gitignore to ignore entries in lower-level directories, all your entries for that lower-level directory are by definition anchored: you have no choice here. When using a .gitignore at the same level as the files, you have a choice. You might still want anchored entries, e.g., /* and !/*_final/, but you have the option of going either way, whereas with the top-level file, your entries are all anchored.
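As a rough sketch of what such a per-directory file could look like for the layout in the question (this is an illustration, not something the question or the rules above spell out; the !/.gitignore line in particular is an added assumption so that the ignore file itself stays visible to Git):

```gitignore
# Hypothetical contents of model_results/.gitignore
# Ignore everything directly inside model_results ...
/*
# ... except this ignore file itself (assumption: you want it tracked)
!/.gitignore
# ... and except directories whose names end in _final
!/*_final/
# Inside each *_final directory, ignore every direct child ...
/*_final/*
# ... except the directory named model
!/*_final/model/
```

Because every pattern here starts with a slash, each one is anchored to model_results itself; dropping the leading slash from a pattern that has no other slash would make it match at any depth below this directory.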
Un-ignoring some or all sub-directories
Now, I mentioned that to explicitly un-ignore regtree_final you could use:
!model_results/regtree_final
Note that you can also write:
!model_results/regtree_final/
The trailing slash here means "apply this rule if and only if this name is a directory name". That's why the is_ignored calls in the sample pseudo-code pass the entry's type. Rules like this, that end with a slash, mean "match only if the type is directory".
The problem with un-ignoring regtree_final like this is that it's necessary but not sufficient, as you want all *_final directories un-ignored. You can achieve this with:
!model_results/*_final/
Here we've used the trailing slash to mean "directories only, no files please" and the leading ! to mean "un-ignore", i.e., do look inside this directory.
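Putting these pieces together for the case in the question, one possible top-level .gitignore is sketched below. It keeps the original model_results/** line and assumes the goal is to keep only the model directory (and everything under it) inside each *_final directory; treat it as a starting point to check against your own tree rather than a definitive answer.

```gitignore
# ignore everything under model_results, at any depth
model_results/**
# un-ignore the *_final directories so Git looks inside them
!model_results/*_final/
# un-ignore the model directory inside each of them
!model_results/*_final/model/
# un-ignore everything inside those model directories
!model_results/*_final/model/**
```

The order matters: within one file the last matching rule wins, so the un-ignore lines must come after the model_results/** line. The final line is needed only because ** also matches the files inside model; with a single * on the first line it would not be required.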
Now, if model_results/regtree_final/ contains another directory, e.g., if you have:
model_results/regtree_final/one/file1.ext
model_results/regtree_final/two/file2.ext
there's a problem with your model_results/** line above, because that line ignores model_results/regtree_final/one (and model_results/regtree_final/two as well).
If you have no sub-sub-directories, this isn't really a problem, but when you do get into deeply nested, "bushy" directory structures, it gets a little tricky.
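If the aim were to keep the whole tree under each *_final directory, one way to sidestep this is to replace the double star with a single star, since a single * does not match a slash and so only matches the direct children of model_results. A minimal sketch, under that assumption:

```gitignore
# ignore only the direct children of model_results
model_results/*
# but do look inside directories whose names end in _final
!model_results/*_final/
```

With these two lines, nothing matches model_results/regtree_final/one/file1.ext, so it is not ignored. Note that this keeps everything inside the *_final directories, which is broader than keeping only a model sub-directory.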
One handy trick for Git ignore files is !*/. This is an un-anchored expression, so it applies to all names found anywhere. But it's a trailing-slash expression, so it applies only to directory names ... and it's an "un-ignore" rule because it starts with !. So it completely defeats the directory optimization that Git uses.
That is, when we hit the:
case type_directory:
if is_ignored(name, full_name, type_directory)
section of the code, we're always going to say "no, don't skip". Git will always look inside every directory, and slowly and painfully test every file in every directory. This is slow! It's a Big Hammer. Use with caution.
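For illustration only, here is how that Big Hammer might be combined with the rules from the question. This sketch assumes the !*/ line is placed after the ignore rule it is meant to override, and that only files under a model directory inside a *_final directory should be kept:

```gitignore
# ignore everything under model_results, at any depth
model_results/**
# Big Hammer: never skip a directory, anywhere in the repository
!*/
# keep the files under model inside any *_final directory
!model_results/*_final/model/**
```

Files in the other directories are still ignored, because !*/ applies only to directory names; the cost is that Git now has to walk all of those directories to find that out.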