How to setup gitignore to track specific folders/files multiple levels down?

Time:09-14

I'm trying to set up a repository for multiple projects that use Visual Studio.

I'm using a laptop and a desktop, but VS seems to have issues when stuff is on different drives (VS is installed on C drive for both computers, but on my desktop, my projects and external library are on the D drive while they are on the C drive for my laptop).

I want to track just the .cpp and .h source files, but I also want to keep the folder structure so I don't have to move files around by hand every time I pull from the repo. That means tracking each src folder and its parent (the project folder) as well. I don't want to add another line to the .gitignore for every project I add.

The files would be organized something like this...

-root
  -.gitignore
  -project1/
    -bin/
    -src/
      -main.cpp
      -foo.cpp
      -foo.h
    -project1.sln
    -project1.vcxproj
    -donttrack2.txt
  -project2/
    -bin/
    -src/
      -main.cpp
      -bar.cpp
      -bar.h
    -project2.sln
    -project2.vcxproj
  -donttrack.txt

I've tried looking at the documentation and another answer but I can't seem to get it working. Any help would be appreciated.

CodePudding user response:

In your .gitignore, you can first ignore all files at each nesting level:

*/*.*
*/*/*.*
*/*/*/*.*

and so on for deeper levels. Do not do this with directories, though: once a directory is ignored, the files inside it cannot be re-included.

Afterwards, re-include the files in src folders:

!src/*
!*/src/*
!*/*/src/*

CodePudding user response:

The general rule is:

It is not possible to re-include a file if a parent directory of that file is excluded.
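A quick way to see this rule in action is a throwaway repository, sketched below. The folder and file names reuse the layout from the question; the temporary path and the decision to ignore project1/ are assumptions made only for illustration.

```shell
# Throwaway repository created under a mktemp path; nothing here touches real projects.
repo=$(mktemp -d)
cd "$repo"
git init -q

mkdir -p project1/src
echo 'int main() { return 0; }' > project1/src/main.cpp

# Ignore the whole directory, then try to re-include one file inside it.
printf '%s\n' 'project1/' '!project1/src/main.cpp' > .gitignore

# check-ignore -q exits 0 when the path is ignored and 1 when it is not.
if git check-ignore -q project1/src/main.cpp; then
    result="still ignored"
else
    result="re-included"
fi
echo "$result"

# Stage everything that is not ignored and list what reached the index.
git add .
tracked=$(git ls-files)
echo "$tracked"
```

The negated pattern has no effect here because Git never looks inside the excluded project1/ directory.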

To ignore everything from the root folder of your repository down, except .cpp and .h files, you would use:

*
!*/
!*.cpp
!*.h

Meaning: you need to whitelist the folders first (!*/) before you can re-include individual files inside them.

Double-check with git check-ignore -v -- path/to/file
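As a rough check, the four patterns can be tried against a cut-down copy of the layout from the question. This is only a sketch: the repository lives in a temporary directory and every file is empty.

```shell
# Throwaway repository created under a mktemp path.
repo=$(mktemp -d)
cd "$repo"
git init -q

# Recreate the layout from the question with empty files.
mkdir -p project1/bin project1/src project2/bin project2/src
touch project1/src/main.cpp project1/src/foo.cpp project1/src/foo.h
touch project1/project1.sln project1/project1.vcxproj project1/donttrack2.txt
touch project2/src/main.cpp project2/src/bar.cpp project2/src/bar.h
touch project2/project2.sln project2/project2.vcxproj donttrack.txt

# The four patterns from this answer.
printf '%s\n' '*' '!*/' '!*.cpp' '!*.h' > .gitignore

# Show which rule decides the fate of an ignored file.
git check-ignore -v -- project1/project1.sln

# Stage everything that is not ignored and list what reached the index.
git add .
tracked=$(git ls-files)
echo "$tracked"
```

Only the .cpp and .h files end up staged, with their project/src paths intact. Note that .gitignore itself also matches the * pattern, so it is not staged in this sketch; adding a !.gitignore line would be one way to keep it in the repository.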

CodePudding user response:

To make any sense of any of this, you need to realize two things:

  1. Git never "tracks" any folder.
  2. The word track has a very specific meaning and has very little—almost nothing, but a little bit more than entirely nothing—to do with .gitignore files.

The very name .gitignore is misleading. These are not files that Git will ignore. These are:

  • files that, if untracked, en-masse git add operations should not cause to become tracked;
  • files that, when untracked, git status should not complain about; and
  • files that, when they are to be clobbered by certain Git operations, Git should sometimes feel free to clobber them even if they would normally be considered "precious".

(The last case is complex. It refers to what happens when, e.g., you run git switch or git checkout and, under Git's normal rules, the command would be rejected because it would destroy file content that Git has not saved anywhere, so Git couldn't help you get the content back. This is mostly a problem when a file was not listed in .gitignore at one point, became "tracked", and got committed. Later it was listed in .gitignore and removed, so it is absent from almost all commits but still present in a few old ones where it shouldn't be, and it is a precious file such as an active database. You're unlikely to run into this case. There's no fix for it anyway. Multiple people have tried to fix this, including me, and nobody has succeeded yet.)
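The first two bullets can be checked with a small experiment, sketched here under some assumptions: settings.db is a made-up file name, and the commit identity is passed on the command line only so the sketch runs without any global Git configuration.

```shell
# Throwaway repository created under a mktemp path.
repo=$(mktemp -d)
cd "$repo"
git init -q

# Commit a file before any ignore rule mentions it.
echo 'v1' > settings.db
git add settings.db
git -c user.name=Demo -c user.email=demo@example.com commit -q -m "add settings.db"

# Now list it in .gitignore and change it.
echo 'settings.db' > .gitignore
echo 'v2' > settings.db

# The file is still in the index, so Git still reports the modification.
status_tracked=$(git status --porcelain -- settings.db)
echo "while tracked: '$status_tracked'"

# Remove it from the index only; the working copy stays on disk.
git rm -q --cached settings.db
git -c user.name=Demo -c user.email=demo@example.com commit -q -m "stop tracking settings.db"

# Untracked and ignored: git status is now silent about it.
status_untracked=$(git status --porcelain -- settings.db)
echo "once untracked: '$status_untracked'"
```

Listing a file in .gitignore changes nothing while the file is in the index; the ignore rule only starts to matter once the file is untracked.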

Let's address tracked files first because it's actually quite simple. You just have to buy into Git's world-view (which you have to do anyway if you are going to use Git), so read this next part thoroughly.

Git's index AKA staging area

The main bulk of almost every Git repository is a database full of Git commits and other objects. This database is, in effect, append-only:[1] you add new commits to it. If something isn't right, you add another commit in which you've corrected whatever was wrong. There is literally no Git command to remove items from this database.[2]

The actual content of any commit is frozen for all time as soon as that commit exists. (Git needs this property to make its hashing scheme work; and Git needs its hashing scheme to make distributed version control work. It's not particularly uncommon for version control systems to have this read-only property in the first place, though.) So the files that are stored in a commit—every commit stores a full snapshot of every file, using clever sharing tricks to avoid storing the same content twice—aren't the files that you can actually see and edit. Instead, the committed files are a sort of archive, like a tarball or zip file or WinRAR or whatever.

This means that Git will extract the files from a commit for you to work on them. Most version control systems are like this: you pick a commit (or some version of some file) and the VCS extracts that commit (or that file) into a work area. So far, so what: this is the same as any other VCS.

Here's where things get weird and different in Git: when Git extracts all the files from a commit, it keeps a copy (in the internal, compressed-and-de-duplicated format) in an area that Git has three names for: the index, or the staging area, or (rarely these days) the cache.

So when you extract some commit, there are in fact three copies of each file, not the two you'd expect:

  • there's a frozen-for-all-time archived copy in the commit;
  • there is this weird exclusive-to-Git copy in Git's index / staging-area; and
  • there is a usable copy that you can see and work on / with.

When you choose to make a new commit, though, Git completely ignores the usable copy of each file. Instead, Git uses the index copy of each file.

This means that whenever you modify the working copy of a file, you must run git add on it. The git add command tells Git: compress the working copy into the internal form, and update the index copy. This doesn't overwrite the committed copy. Instead, it makes a temporary ready-to-go copy, or finds the existing duplicate. But either way the index copy was ready to commit before, and is ready to commit now: what's changed is the content in the index copy.

(Because there is this third copy, you can, if you like, use something like git add -p to store a different third copy, halfway between the committed and current copy. Some people like to use this trick to build commits on the fly, after doing a bunch of work. But that's just a side note.)

By adding an all-new-file to this index / staging-area, you can arrange for a new file to be in the next commit. By removing a file from this index / staging-area, you can arrange for the next commit's snapshot to omit that file. And by changing a file and replacing the index copy, you can arrange for the next commit's snapshot to have the updated file.

In all cases, the index is always ready to be used as the next snapshot.[3] When you run git commit, Git takes the index's semi-frozen files, freezes them solidly, and puts them into the new commit. So the commit holds the files that are in Git's index, not the files in your working tree!
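To make the three copies concrete, here is a minimal sketch in a throwaway repository. The file name note.txt and the commit identity are placeholders chosen for the example.

```shell
# Throwaway repository created under a mktemp path.
repo=$(mktemp -d)
cd "$repo"
git init -q

echo 'first draft' > note.txt
git add note.txt                 # the index copy now holds "first draft"
echo 'second draft' > note.txt   # the working copy changes; the index copy does not

git -c user.name=Demo -c user.email=demo@example.com commit -q -m "snapshot"

committed=$(git show HEAD:note.txt)   # content frozen into the commit
staged=$(git show :note.txt)          # content currently in the index
working=$(cat note.txt)               # content in the working tree
echo "commit: $committed | index: $staged | working tree: $working"
```

The commit and the index agree with each other, and the edit that was never passed to git add exists only in the working tree.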

This leads us to the idea of a tracked file. A file in Git is tracked if and only if it is in Git's index right now. You can add files to Git's index with git add, and you can remove files from Git's index with git rm; you just have to also remember that, as you switch from one commit to another (by switching branch names for instance), Git will fill in Git's index from the commit.

An untracked file is therefore quite simple as well: a file that exists in your working tree, where you can see and edit it, is untracked if and only if that same file is not in Git's index right now. If it's not in Git's index right now and you run git commit right now, then it isn't in the new commit either, and that's basically all there is to it. You decide which files will be in commits, and which won't, by deciding which files are to be in Git's index right now—which you control, directly (git add, git rm) and indirectly (operations that check out commits or otherwise affect Git's index).
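One way to watch a file move between tracked and untracked is to list the index directly, as in this sketch. The file names are invented for the example; git ls-files prints the paths in the index, and git ls-files --others prints working-tree paths that are not in it.

```shell
# Throwaway repository created under a mktemp path.
repo=$(mktemp -d)
cd "$repo"
git init -q

touch main.cpp scratch.txt
git add main.cpp

tracked=$(git ls-files)              # paths in the index
untracked=$(git ls-files --others)   # working-tree paths not in the index
echo "tracked: $tracked | untracked: $untracked"

# Taking main.cpp out of the index makes it untracked again; the file itself stays on disk.
git rm -q --cached main.cpp
tracked_after=$(git ls-files)
untracked_after=$(git ls-files --others)
echo "tracked: '$tracked_after' | untracked: $untracked_after"
```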


[1] There are, of course, some maintenance commands that can clean it out, or you can do the standard thing of "dump out old database, load new database excluding the parts we don't want to copy".

[2] The git gc and git prune commands are the maintenance commands alluded to in footnote 1. You don't tell them "remove object O" for some object ID, though: instead, you carefully arrange for object O to be unreferenced, then run git gc to discard the unreferenced objects. So it's a rather indirect way to do this. In general, you don't do it at all: you let Git run git gc --auto whenever Git thinks it would be profitable to do this kind of housecleaning. You let Git generate "garbage" (unused objects) on purpose, the way Git does in general, and you let Git clean up after itself when Git thinks it's smart, and that takes care of everything.

[3] During a conflicted merge, the index holds more than one copy of each conflicted file. At this time git commit or git write-tree will note that you can't make a commit, because the index holds conflicts. So it's a slight overstatement to say that the index always holds the next commit's snapshot, because sometimes you can't get Git to write out the index. But even in that case, the index does hold the next snapshot: you just can't make the snapshot.
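The indirect removal described in the footnote on git gc and git prune can be sketched as follows. This is an illustration only: the blob is written by hand with git hash-object so that nothing references it, and --prune=now is used so the sketch does not have to wait out git gc's default grace period.

```shell
# Throwaway repository created under a mktemp path.
repo=$(mktemp -d)
cd "$repo"
git init -q

# Write a blob that no commit, tree, or ref points to.
oid=$(echo 'orphaned content' | git hash-object -w --stdin)

if git cat-file -e "$oid"; then before="present"; else before="missing"; fi

# Discard unreferenced objects right away instead of waiting for the grace period.
git gc -q --prune=now

if git cat-file -e "$oid" 2>/dev/null; then after="present"; else after="missing"; fi
echo "before gc: $before | after gc: $after"
```

At no point is Git told to delete that particular object; it disappears only because nothing refers to it.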
