How does ANSI C-Quoting in Herestrings work?

Time:02-26

Why is this not working?

bla="
multi

line

string
"
cat -A <<EOF
${bla//\$'\n'/\\\$'\n'}
EOF

this works:

cat -A <<EOF
$(cat <<<${bla//$'\n'/\\$'\n'})
EOF

as noted in a comment this also works:

newline=$'\n'
cat -A <<EOF
${bla//$newline/\\$newline}
EOF

Expected:

\$
multi\$
\$
line\$
\$
string\$

I guess it has something to do with quoting and how heredocs work. But since set -x does not help with debugging parameter expansion, I could not figure it out.

A link to where it is explained in the GNU Bash Reference Manual would also help.

Note: The $ characters in the expected output come from the cat -A option and mark the end of each line, i.e. the newline character \n.

CodePudding user response:

I think you need to remove some of the \ in the parameter expansion.

That is, it should read ${bla//$'\n'/\$$'\n'}.

$ bla="
multi
line
string
"
$ printf "%q\n" "$bla"
$'\nmulti\nline\nstring\n'
$ gcat -A <<EOF
${bla//$'\n'/\$$'\n'}
EOF
$
multi$
line$
string$
$

What I have adds a $ at the end of each line.

$ bla="
multi
line
string
"
$ printf "%q" "$bla"
$'\nmulti\nline\nstring\n'
$ printf "%s" "${bla//$'\n'/\$$'\n'}"
$
multi$
line$
string$
$ printf "%q" "${bla//$'\n'/\$$'\n'}"
$'$\nmulti$\nline$\nstring$\n'

I am using

$ echo $BASH_VERSION
5.1.16(1)-release

CodePudding user response:

Why is this not working?

Apparently because Bash has longstanding behavioral inconsistencies around ANSI-C quoting.

Its documentation for ANSI-C quoting does not make any contextual exceptions.

Its documentation for heredocs does not make any special provision that seems relevant:

If [the delimiter word] is unquoted, all lines of the here-document are subjected to parameter expansion, command substitution, and arithmetic expansion, the character sequence \newline is ignored, and ‘\’ must be used to quote the characters ‘\’, ‘$’, and ‘`’.

Inasmuch as I take that last sentence to have an implied "in order to remove their special significance", I see nothing there or in the description of parameter expansion to suggest that parameter expansion works differently in a heredoc than it works elsewhere.

But demonstrably, it does. Specifically, ANSI-C quoted strings are recognized in parameter expansions outside of heredocs, but not in lexically the same parameter expansions inside heredocs. This seems to affect all forms of parameter expansion that involve text other than the parameter name as part of the expansion, and it does not depend on the use of any of the C-style escape sequences. Examples:

$ ex1=abcdef
$ cat <<<${ex1#*$'b'}
cdef
$ cat <<EOF
${ex1#*$'b'}
EOF
abcdef
$ ex2=a\$bcdef
$ cat <<<${ex2%$'b'*}
a$
$ cat <<EOF
${ex2%$'b'*}
${ex2/$'b'/X}
EOF
a
aXcdef
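
Since the result may depend on the Bash version, here is a minimal sketch for checking a given shell. It runs the same expansion through a herestring and through a heredoc and reports whether the two agree. The variable names from_herestring, from_heredoc and verdict are only illustrative.

```bash
ex1=abcdef

# Herestring: $'b' is treated as ANSI-C quoting, so the pattern is *b.
from_herestring=$(cat <<<${ex1#*$'b'})

# Heredoc: lexically the same expansion, read from a heredoc body.
from_heredoc=$(cat <<EOF
${ex1#*$'b'}
EOF
)

# Compare the two results instead of assuming either outcome.
if [ "$from_herestring" = "$from_heredoc" ]; then
    verdict=consistent
else
    verdict=inconsistent
fi

printf 'bash %s: herestring=%s heredoc=%s (%s)\n' \
    "$BASH_VERSION" "$from_herestring" "$from_heredoc" "$verdict"
```

A verdict of "inconsistent" means the shell shows the difference described above.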

The actual behavior seems to be that the $ introducing the C-style quote is taken as a literal character, but the single quotes are interpreted according to their ordinary role as quote characters. The simplest workaround seems to be to assign the ANSI-C-quoted text to a variable:

$ var=$'b'
$ cat <<EOF
${ex1/${var}/X}
EOF
aXcdef
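
Applied to the multi-line string from the question, the same workaround could look like the sketch below. It uses a shortened value for bla and captures the heredoc output in a variable (escaped, a name chosen here for illustration) so that it can be inspected with printf %q instead of cat -A.

```bash
# Multi-line value similar to the one in the question (shortened).
bla=$'\nmulti\nline\n'

# Expand the ANSI-C quotes in an ordinary assignment, outside the heredoc.
newline=$'\n'

# Inside the heredoc, refer only to the variable. The doubled backslash
# stands for one literal backslash placed in front of each newline.
escaped=$(cat <<EOF
${bla//$newline/\\$newline}
EOF
)

# %q prints the value in a shell-quoted form that makes the backslashes
# and newlines visible.
printf '%q\n' "$escaped"
```

Every line of the result should now end in a backslash. Note that the command substitution strips trailing newlines, so the captured value is slightly shorter than what cat -A would show for the heredoc itself.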

I was first inclined to guess that this behavioral difference is related to ANSI-C quoting not being recognized directly in the body of a heredoc. However, neither is quoting with single- or double-quotes recognized directly in heredocs, yet those quoting forms are still recognized inside parameter expansions embedded in heredocs. I'm having a hard time seeing why it should not be considered a bug that some parameter expansions are interpreted differently inside heredocs than they are outside.
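
To illustrate that last point, this small sketch uses ordinary single quotes inside a parameter expansion in a heredoc, next to the variable workaround. It assumes the pattern-removal operator behaves as described above; the names single and via_var are hypothetical.

```bash
ex1=abcdef

# Ordinary single quotes inside the pattern: the quotes are removed and the
# pattern is *b, so everything up to and including "b" is stripped.
single=$(cat <<EOF
${ex1#*'b'}
EOF
)

# The variable workaround, for comparison.
var=$'b'
via_var=$(cat <<EOF
${ex1#*$var}
EOF
)

printf 'single quotes: %s\nvariable:      %s\n' "$single" "$via_var"
```

If both lines show the same remainder, plain single quotes can stand in for $'...' whenever no escape sequence is needed.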
