Using sed in git filter-branch to replace a path in XML files

Time:10-15

I am trying to replace a path using sed in a git filter-branch command. I have three similar files, each named DOM.xml, in three different folders. For example:

SAM20/sam/DOM.xml
SAM21/sam/DOM.xml
SAM22/sam/DOM.xml

The contents of the three DOM.xml files differ. Running my filter-branch command fails with the following error:

Proceeding with filter-branch...
Rewrite ce96e44a942bfdc26bd8aa6fa4407b4a88965bca (1/644) (0 seconds passed, remaining 0 predicted)
sed: can't read SAM22/sam/DOM.xml: No such file or directory
tree filter failed: sed -i -e 's|./old/path/in/sam|./new/path/in/sam|g' SAM22/sam/DOM.xml

CodePudding user response:

Side note: you should show the actual git filter-branch command itself, not just its output. Fortunately we can see from the error message that you are using the --tree-filter option (which is the right option here).

The problem is almost certainly the fact that the first commit in the repository simply does not have these files. Your command requires that the file SAM22/sam/DOM.xml exist when that particular commit is checked out, and the first commit might have just a README, for instance, or lack the SAM22/sam/DOM.xml file while having SAM20 and/or SAM21 files.

If it's OK for several files to be missing, use a sequence that does nothing when the files are missing, and edits the files when they exist. For instance:

if test -f SAM22/sam/DOM.xml; then
    sed -i -e 's|./old/path/in/sam|./new/path/in/sam|g' SAM22/sam/DOM.xml
fi

To keep yourself sane, you might wish to build the tree filter into a shell script that you can run with sh /tmp/script or similar, so that your Git command reads:

git filter-branch <various options> --tree-filter "sh /tmp/script" <more options>
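
As a rough sketch of what /tmp/script might contain, assuming all three DOM.xml files need the same substitution and that GNU sed is in use (the -i flag behaves differently with BSD/macOS sed): the loop below skips any file that is absent from the commit being rewritten. The function name rewrite_dom_paths is made up for illustration; the paths and the sed expression are the ones from the question.

```shell
#!/bin/sh
# Sketch of a tree-filter script. git filter-branch runs the tree filter
# from the root of each checked-out commit, so relative paths work.
# rewrite_dom_paths is a hypothetical name, not part of Git.
rewrite_dom_paths() {
    for f in SAM20/sam/DOM.xml SAM21/sam/DOM.xml SAM22/sam/DOM.xml; do
        # Skip commits in which this file does not exist.
        if test -f "$f"; then
            # Assumes GNU sed for in-place editing with -i.
            sed -i -e 's|./old/path/in/sam|./new/path/in/sam|g' "$f" || return 1
        fi
    done
}

rewrite_dom_paths
```

Saved as /tmp/script, this would plug into the command line above unchanged. Returning non-zero when sed fails keeps the behavior of stopping the rewrite on a real error, while a missing file is no longer treated as one.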

This will be a little bit slower than including the entire set of commands in the command line, but --tree-filter is already so miserably slow that this probably won't be much worse. To make things go much faster, consider using git filter-repo instead of git filter-branch.

CodePudding user response:

If your tree has a lot of files, an index filter will be faster, often dramatically so, because it edits the index directly instead of checking out every commit. Something like:

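# Meant to run as the index filter; it never touches the working tree.
for f in $(git ls-files ":(glob)SAM2[012]/sam/DOM.xml")
do
        # Pipe the staged blob through sed and store the result as a new blob.
        updated=$(
                git show ":$f" \
                | sed "s,path/to/old/locn,path/to/new/locn,g" \
                | git hash-object -w --stdin --path="$f"
        )
        # Point the index entry at the new blob (assumes a regular 100644 file).
        git update-index --cacheinfo "100644,$updated,$f"
done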
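
To see what the loop does, it can be tried outside filter-branch in a throwaway repository. The sketch below is only a demonstration under assumed names: the scratch directory, the sample XML line, and the old/new path strings are placeholders. It stages one file, runs the same pipeline against the index, and then prints both the staged and the on-disk versions.

```shell
#!/bin/sh
# Demo only: rewrite a staged blob the way an index filter would.
repo=$(mktemp -d)               # throwaway repository (placeholder location)
cd "$repo" || exit 1
git init -q .
mkdir -p SAM22/sam
printf '<dir>path/to/old/locn/lib</dir>\n' > SAM22/sam/DOM.xml
git add SAM22/sam/DOM.xml

# Same pipeline as above: only matching paths present in the index are visited.
for f in $(git ls-files ":(glob)SAM2[012]/sam/DOM.xml")
do
        updated=$(
                git show ":$f" \
                | sed "s,path/to/old/locn,path/to/new/locn,g" \
                | git hash-object -w --stdin --path="$f"
        )
        git update-index --cacheinfo "100644,$updated,$f"
done

git show :SAM22/sam/DOM.xml     # staged content: contains path/to/new/locn
cat SAM22/sam/DOM.xml           # working-tree file: still path/to/old/locn
```

In an actual rewrite only the for loop would go into the filter, passed with --index-filter rather than --tree-filter. Because git ls-files lists only paths that exist in the index of the commit being rewritten, commits that lack some of the files need no extra existence check.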