Remove all files from Git repo history with path having escape \ in filename with git filter-repo

Time:01-21

I have files whose names contain backslash (\) characters stored in a Git repository on Debian 10 Linux.

Problem: these files cannot be checked out on Windows, because their names contain characters that Windows does not allow.

Example:

git log --all --name-only -m --pretty= '*\\*'
"systemd/system/default.target.wants/snap-git\\x2dfilter\\x2drepo-7.mount"
"systemd/system/multi-user.target.wants/snap-git\\x2dfilter\\x2drepo-7.mount"
"systemd/system/snap-git\\x2dfilter\\x2drepo-7.mount"

I get the following Git errors when checking out on Windows:

C:\Git\bin\git.exe reset --hard "5ef1cac3a03304c35b455edf32bd1bb78060c5b9" --
error: invalid path 'systemd/system/default.target.wants/snap-git\x2dfilter\x2drepo-7.mount'
fatal: Could not reset index file to revision '5ef1cac3a03304c35b455edf32bd1bb78060c5b9'.
Done

Steps to reproduce the problem:

# Clone the repository so that the rewrite runs on a disposable copy:
git clone --no-local /source/repo/path/ /target/path/to/repo/clone/
# Cloning into '/target/path/to/repo/clone'...
# remote: Enumerating objects: 9534, done.
# remote: Counting objects: 100% (9534/9534), done.
# remote: Compressing objects: 100% (4776/4776), done.
# remote: Total 9534 (delta 4215), reused 8043 (delta 3136), pack-reused 0
# Receiving objects: 100% (9534/9534), 7.41 MiB | 16.78 MiB/s, done.
# Resolving deltas: 100% (4215/4215), done.

cd /target/path/to/repo/clone/

# List the files with escape \ from repo history into a list file:
git log --all --name-only -m --pretty= '*\\*' | sort -u >/opt/git_repo_files_w_escape.txt

# Remove the files with escape \ from repo history:
git filter-repo --invert-paths --paths-from-file /opt/git_repo_files_w_escape.txt
Parsed 592 commits
New history written in 0.25 seconds; now repacking/cleaning...
Repacking your repo and cleaning out old unneeded objects
HEAD is now at 71128f3 .gitignore: ADD snap-git to be ignored
Enumerating objects: 9354, done.
Counting objects: 100% (9354/9354), done.
Delta compression using up to 8 threads
Compressing objects: 100% (3694/3694), done.
Writing objects: 100% (9354/9354), done.
Total 9354 (delta 4085), reused 9354 (delta 4085), pack-reused 0
Completely finished after 0.55 seconds.


# List files with escape \ to check result:
git log --format="reference" --name-status --diff-filter=A '*\\*'
# "systemd/system/default.target.wants/snap-git\\x2dfilter\\x2drepo-7.mount"
# "systemd/system/multi-user.target.wants/snap-git\\x2dfilter\\x2drepo-7.mount"
# "systemd/system/snap-git\\x2dfilter\\x2drepo-7.mount"

# Unfortunately, filter-repo ran without errors, but the log still lists filenames with escape \ :-(

Question:

1) How can I remove from the Git repo history all files whose path contains at least one backslash (\) character?

(Reason: those files cannot be checked out on Windows, because their names contain characters that Windows does not allow.)

UPDATE1:

I tried replacing the \\x2d string with - in the input file list, as suggested, but removing the files from history still failed:

# List the files with escape \ from repo history into a list file:
git log --all --name-only -m --pretty= '*\\*' | sort -u >/opt/git_repo_files_w_escape.txt

# Replace the \\x2d string with - in git_repo_files_w_escape.txt:
sed -i 's/\\\\x2d/-/g' /opt/git_repo_files_w_escape.txt

# Remove the listed files from repo history:
git filter-repo --invert-paths --paths-from-file /opt/git_repo_files_w_escape.txt
Parsed 592 commits
New history written in 0.25 seconds; now repacking/cleaning...
Repacking your repo and cleaning out old unneeded objects
HEAD is now at 71128f3 .gitignore: ADD snap-git to be ignored
Enumerating objects: 9354, done.
Counting objects: 100% (9354/9354), done.
Delta compression using up to 8 threads
Compressing objects: 100% (3694/3694), done.
Writing objects: 100% (9354/9354), done.
Total 9354 (delta 4085), reused 9354 (delta 4085), pack-reused 0
Completely finished after 0.55 seconds.


# List files with escape \ to check result:
git log --format="reference" --name-status --diff-filter=A '*\\*'
# "systemd/system/default.target.wants/snap-git\\x2dfilter\\x2drepo-7.mount"
# "systemd/system/multi-user.target.wants/snap-git\\x2dfilter\\x2drepo-7.mount"
# "systemd/system/snap-git\\x2dfilter\\x2drepo-7.mount"

#  Unfortunately log still lists filenames with \\x2d :-(

UPDATE2:

I tried replacing \\x2d in git_repo_files_w_escape.txt with \\\\x2d or \x2d, but neither variant removed the files with \\x2d in their names from the Git history.

UPDATE3:

I'm looking for a working solution based on git filter-repo.

Any more ideas? Is this a bug?

CodePudding user response:

FWIW, this worked on a Linux system. It allowed me to rewrite the HEAD commit without having the files checked out on disk:

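# Requires bash. -z makes git print raw, unquoted names (single backslashes).
git ls-files -z | grep -z -a -F '\' | while IFS= read -r -d '' f; do
    # Replace every literal \x2d in the path with a plain dash
    new=${f//\\x2d/-}
    # Make sure the target directory exists before writing the blob there
    mkdir -p "$(dirname "$new")"
    git show "@:$f" > "$new"
    git rm --cached "$f"
    git add "$new"
done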

git status
git commit --amend

The same commands should work in Git Bash on Windows.

CodePudding user response:

Assuming you have many files that you want to fix scattered in the hierarchy, a solution with git filter-repo looks tedious. You can instead use a combination of git fast-export and git fast-import to modify file names in the whole history.

git fast-export --no-data --all > exported

Now delete the file entries containing a backslash:

grep -v '^[DM] .*\\' exported > fixed

Instead of removing the files, you can also modify the file names. For example, to replace each backslash with a dash (-), you could try this:

sed -e '/^[DM] /s,\\,-,g' < exported > fixed
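If only the systemd-escaped dashes need fixing, the substitution can be narrowed to the \x2d sequence. This is a minimal sketch, assuming fast-export writes such paths C-quoted with doubled backslashes (the same form the git log output in the question shows). The function name fix_x2d_paths is made up for illustration.

```shell
# Hypothetical helper: rewrite the escaped dash in file entries only.
# Reads a fast-export stream on stdin and writes the edited stream to stdout.
fix_x2d_paths() {
    # In the sed pattern, \\\\ matches the two literal backslashes of a quoted path.
    sed -e '/^[DM] /s,\\\\x2d,-,g'
}

# Usage: fix_x2d_paths < exported > fixed
```

Like the commands above, this touches every line that starts with "D " or "M ", so a commit message line with that prefix would be edited as well.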

You may now investigate the difference between the two files to ensure that no commit messages were modified:

diff -u exported fixed | less
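For a large history, reading the whole diff by eye is error-prone. The sketch below prints only those changed lines that do not look like file entries. It is a heuristic that assumes "M" entries carry a six-digit mode and a hexadecimal object id (as produced with --no-data) and that "D" entries for these paths are quoted. The function name suspicious_changes is hypothetical.

```shell
# Hypothetical helper: list changed lines that are probably not file entries.
# $1 = original stream, $2 = edited stream.
suspicious_changes() {
    diff "$1" "$2" | grep '^[<>] ' |
        grep -Ev '^[<>] (M [0-7]{6} [0-9a-f]{40,64} |D ")'
}

# Usage: any output means a non-path line was touched and needs a manual look.
# suspicious_changes exported fixed
```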

Now attempt to import the modified history:

git fast-import < fixed

This will stop with an error that tells you that the branches will not be modified because the old branch heads are not subsets of the new heads. If there are no other errors, you can now force the modification:

git fast-import --force < fixed
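For readers who prefer to stay with git filter-repo, as the question asks, here is a hedged sketch. It assumes the attempts in the question failed because the list file held the quoted form of each name (surrounding double quotes and doubled backslashes), while --paths-from-file matches lines literally against the real path. The -z option makes git log print names verbatim. The function names raw_path_list and purge_backslash_paths are made up for illustration, and the script needs bash.

```shell
# Read NUL-delimited names on stdin, print unique non-empty names, one per line.
# Assumption: no file name contains a newline.
raw_path_list() {
    tr '\0' '\n' | sed '/^$/d' | sort -u
}

# Hypothetical wrapper: remove every path that contains a backslash.
# It rewrites history, so run it only inside a fresh clone.
purge_backslash_paths() {
    local list
    list=$(mktemp) || return 1
    # -z prints raw names: no quotes, single backslashes
    git log --all --name-only -m --pretty= -z -- '*\\*' | raw_path_list > "$list"
    git filter-repo --invert-paths --paths-from-file "$list"
    rm -f "$list"
}

# Usage (inside the fresh clone): purge_backslash_paths
```

Afterwards, the git log check from the question should print nothing if the list matched.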