Checkout files from a specific commit failed with space in the path

Time:02-25

I want to bring all the changes from a specific commit, 76363636, on branch1 over to branch2.

Simply checking out that commit does not fit my purpose, so I used the following command instead:

git checkout branch1 $(git diff-tree --no-commit-id --name-only -r 76363636)

This works fine when the file paths in the commit contain no spaces; I have used this command several times.

But it fails when one of the file paths contains a space, for example:

force-app/main/default/layouts/PersonAccount-Layout Professionnel.layout-meta.xml

I get the following error:

error: pathspec 'force-app/main/default/layouts/PersonAccount-Layout' did not match any file(s) known to git

How can I protect the file names with double quotes, given that I get the list of files dynamically?

The code below does not work:

git checkout branch1 "$(git diff-tree --no-commit-id --name-only -r 76363636)"

CodePudding user response:

TL;DR

You want to set IFS temporarily while running the command:

save="$IFS"
IFS=$'\n'    # depends on shell
git checkout branch1 $(git diff-tree --no-commit-id --name-only -r 76363636)
IFS="$save"

or similar. You may have to use something other than $'\n' here. You might also consider using git restore rather than git checkout. See all the details below.

Long

This isn't a problem with Git—well, not exactly—but rather with the shell you're using. The fix is either to get the shell to behave better, or to bypass it entirely (see the git restore section below).

When you enter a command at the command prompt:

$ command with some arguments

it is the shell—the command-line interpreter—that breaks up the pieces here: command, with, some, and arguments each become a "word", and the shell then finds some executable somewhere on the system that is named command and runs it with all four "words" provided as four separate argv arguments. The argv[0] argument is typically either ignored entirely, or used to augment error messages in case the command is installed under a different name than expected (e.g., git-2.17 to run an old version of Git). The remaining arguments, in argv[1] through argv[3] inclusive in this case, are then interpreted by the program—but note that they have been pre-divided.

Should you wish to run, e.g.,

$ git restore --source=HEAD -SW "file with spaces"

you must use quotes (double or single) so that the shell invokes git with arguments restore, --source=HEAD, -SW, and file with spaces. Note how the enclosing quotes have vanished but the spaces are retained: there's a single argument containing two blanks.
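To see the pre-division for yourself, here is a minimal sketch. The count_args function is a hypothetical helper, not part of Git or the shell; it only reports how many arguments it was handed.

```shell
# count_args is a hypothetical helper: it prints how many arguments it received.
count_args() {
    echo "$#"
}

unquoted=$(count_args file with spaces)     # the shell passes three arguments
quoted=$(count_args "file with spaces")     # the shell passes one argument

echo "unquoted: $unquoted, quoted: $quoted"
# prints: unquoted: 3, quoted: 1
```

The program never sees the quotes; it only sees the argument boundaries the shell chose.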

The command:

git diff-tree --no-commit-id --name-only -r 76363636

itself is broken into six words, beginning with git and ending with 76363636. The shell runs that command and, because the command as a whole was enclosed in $(...), reads its output. The shell then interprets that output as a series of words separated by white space (spaces, tabs, and newlines), breaks it into those words, and then runs:

git checkout branch1 <word1> <word2> ... <wordN>

for all N broken-apart words.
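The following sketch illustrates that splitting, and also why wrapping the whole $(...) in double quotes does not help. Here fake_diff_tree is a hypothetical stand-in for the real git diff-tree call (so the example runs outside a repository), and count_args is a hypothetical helper that reports its argument count. The sketch assumes IFS still has its default value.

```shell
# fake_diff_tree is a hypothetical stand-in for git diff-tree: it prints two
# path names, one per line, and the first one contains a space.
fake_diff_tree() {
    printf '%s\n' 'layouts/Layout Professionnel.xml' 'classes/Foo.cls'
}

# count_args is a hypothetical helper: it prints how many arguments it received.
count_args() {
    echo "$#"
}

# Unquoted: the output is split at every space and newline, so the path
# containing a space is torn into two words.
split_count=$(count_args $(fake_diff_tree))

# Quoted: no splitting happens at all, so both path names arrive glued
# together (newline included) as one single argument.
quoted_count=$(count_args "$(fake_diff_tree)")

echo "unquoted: $split_count, quoted: $quoted_count"
# prints: unquoted: 3, quoted: 1
```

Neither count is the 2 we want: one is too many, the other too few.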

Since it's the shell that is doing the breaking-up, it is the shell that you must overcome here. There is a way to do this.

The shell breaks words using $IFS

Bourne-derived shells use the internal field separator variable, $IFS, to determine what makes something a "word". The default IFS setting is space-tab-newline. This is a bit tricky to show, since space, tab, and newline all display as blanks or nothings on your screen.

To represent space, tab, and newline, we can use a literal space in quotes, the sequence \t for TAB, and the sequence \n for NL (newline). Some shells let you do this directly:

var=$' \t\n'

That's a single quote, preceded by a dollar-sign $ character: the text inside is then interpreted with backslash sequences handled similarly to their usage in the C programming language.

Some versions of the shell do not allow this; here, we can use the POSIX printf command:

printf " \t\n"

(here, either double or single quotes suffice).

What we want, of course, is for the shell to break only at newlines. Should you have files whose names contain embedded newlines, this won't work, but such file names are particularly evil and don't seem to be in common use, unlike Windows and macOS file names, which commonly do have embedded spaces. If you're really aiming for bullet-proofing, you would want the -z option and NUL-terminated paths: the ASCII NUL is the only character literally forbidden in pathnames on Linux (and hence in Git as well).
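As a rough illustration of newline-only splitting, the sketch below sets IFS to a literal newline inside single quotes, which should work in any Bourne-derived shell. As before, fake_diff_tree and count_args are hypothetical helpers that stand in for the real git diff-tree call and for the program receiving the arguments.

```shell
# fake_diff_tree is a hypothetical stand-in for git diff-tree.
fake_diff_tree() {
    printf '%s\n' 'layouts/Layout Professionnel.xml' 'classes/Foo.cls'
}

# count_args is a hypothetical helper: it prints how many arguments it received.
count_args() {
    echo "$#"
}

old_ifs=$IFS
# A literal newline between single quotes: split at newlines only.
IFS='
'
newline_count=$(count_args $(fake_diff_tree))
IFS=$old_ifs

echo "words with newline-only IFS: $newline_count"
# prints: words with newline-only IFS: 2
```

The path containing a space now survives as a single word.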

Having had the shell break our path names up at newlines, we should—for "shell hygiene" if nothing else—restore the $IFS setting afterward. For that, we can either set IFS back to space-tab-newline literally:

IFS=$'\n'
... do things using $(...) ...
IFS=$' \t\n'

Or, we can grab the current setting, set it to what we need, and then restore the old setting. This is nice for other users who may have set IFS for their purposes and don't want us destroying it:

f() {
    local save
    save="$IFS"
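    IFS=$(printf '\n.')   # "." sentinel: $(...) strips a bare trailing newline
    IFS=${IFS%.}          # drop the sentinel, leaving only the newline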
    ... our code ...
    IFS="$save"
}

Function f can now be called safely from some other shell function that also changes IFS temporarily, without having to keep setting IFS in this other shell function.

(Here, I've used printf instead of $'...', in case we have a shell that does not allow the $'...' usage.)
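One pitfall when producing a newline with printf inside $(...) is that command substitution strips trailing newlines, so the result needs a sentinel character that is removed afterwards. The sketch below shows both the pitfall and a save-and-restore function; the names with_newline_ifs, nl, and word_count are made up for illustration, and local is assumed to be available (it is in bash, dash, and zsh, though POSIX does not require it).

```shell
stripped=$(printf '\n')             # empty: $(...) dropped the trailing newline
nl=$(printf '\n.'); nl=${nl%.}      # sentinel trick: nl now holds one newline

# with_newline_ifs is a hypothetical example function. It splits some sample
# text at newlines only, then puts IFS back the way the caller had it.
with_newline_ifs() {
    local save
    save=$IFS
    IFS=$nl
    set -- $(printf '%s\n' 'a b' 'c')   # two lines, the first contains a space
    IFS=$save
    word_count=$#                       # global, so the caller can inspect it
}

ifs_before=$IFS
with_newline_ifs
echo "words: $word_count"
# prints: words: 2
```

After the call, IFS compares equal to ifs_before, so a caller that had its own IFS setting is not disturbed.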

Using git restore

The git checkout command has a couple of defects here:

  • If the list of differences from git diff-tree is empty, we check out the entire branch. That's probably not what we want. This is a very big defect!
  • If some of the file names start with -, this git checkout will behave badly. We can fix this by adding --, which is probably a good idea.
  • If branch1 is not a valid branch name, git checkout will behave badly.
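If you stay with git checkout, the first two defects can be guarded against by refusing to run on an empty list and by placing -- before the paths. The sketch below is a dry run only: build_checkout is a hypothetical helper that prints the command it would run rather than running it, and it assumes the path names have already been split into separate arguments.

```shell
# build_checkout is a hypothetical dry-run helper: it prints the git command
# it would run for the given path names, or a notice if there are none.
build_checkout() {
    if [ "$#" -eq 0 ]; then
        # No paths: running "git checkout branch1" here would switch branches.
        echo "no paths changed; not running git checkout"
    else
        # "--" marks the end of options, so a leading "-" in a name is harmless.
        printf 'git checkout branch1 -- %s\n' "$*"
    fi
}

empty_case=$(build_checkout)
dash_case=$(build_checkout -weird-name.txt)

echo "$empty_case"
echo "$dash_case"
```

This does nothing about an invalid branch1, which is one more reason to prefer the approach described next.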

We can avoid all of these problems and solve the entire IFS-related problem by using git restore with its --pathspec-from-file and --pathspec-file-nul flags. (Be sure your version of Git supports these flags for git restore; they first appeared in Git 2.25, while git restore itself first appeared in Git 2.23.) The -z flag in git diff-tree has been there far longer, so if your Git is at least 2.25 you have both.

What we do, then, is this:

git diff-tree <options> -z <rev> |
git --literal-pathspecs restore --source=branch1 --pathspec-from-file=- --pathspec-file-nul

The git diff-tree command runs as usual, but this time it outputs un-encoded path names (this solves a number of problems you have not yet encountered) and terminates each pathname with an ASCII NUL. The git restore command then reads these pathnames (as pathspecs) from standard input. The --literal-pathspecs option to git itself tells git restore not to try to interpret pathspec magic in any of the input paths.
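For an end-to-end picture, here is a sketch that builds a throwaway repository and runs the pipeline with the question's diff-tree options filled in. Everything about the repository is an assumption made for the demo: the temporary directory, the demo identity, the file names, and the file contents. It assumes Git 2.25 or later is installed.

```shell
# Build a scratch repository (all names and contents here are made up).
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo" || exit 1
git config user.name demo
git config user.email demo@example.com

# branch2 holds a base commit; branch1 adds a file whose name contains spaces.
git checkout -q -b branch2
echo base > plain.txt
git add plain.txt
git commit -q -m 'base commit'

git checkout -q -b branch1
echo 'from branch1' > 'file with space.txt'
git add 'file with space.txt'
git commit -q -m 'add a file with spaces in its name'
rev=$(git rev-parse HEAD)

# Back on branch2, pull over the files touched by that commit. The paths
# travel through the pipe NUL-terminated, so the shell never splits them.
git checkout -q branch2
git diff-tree --no-commit-id --name-only -r -z "$rev" |
git --literal-pathspecs restore --source=branch1 --pathspec-from-file=- --pathspec-file-nul

restored=$(cat 'file with space.txt')
echo "$restored"
```

Note that git restore with these options writes to the working tree only; the restored file still has to be added and committed on branch2.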
