sed matching "$" literally without considering it regex

Time:08-19

I was trying to use $ in a sed -e command, and it works as expected, e.g.:

sed -e 's/world$/test/g' test.txt

The above command replaces "world" when it appears at the end of a line.

What confused me is that the following matched literally:

sed -e 's/${projects.version}/20.0/g' test.txt

The above command replaced ${projects.version}. I can't explain how sed matched the $ literally; I expected it to be treated as a special character.

CodePudding user response:

As the POSIX spec says:

$ The <dollar-sign> shall be special when used as an anchor.

A <dollar-sign> ( '$' ) shall be an anchor when used as the last character of an entire BRE. The implementation may treat a <dollar-sign> as an anchor when used as the last character of a subexpression. The <dollar-sign> shall anchor the expression (or optionally subexpression) to the end of the string being matched; the <dollar-sign> can be said to match the end-of-string following the last character.

So when $ is not the last character of a BRE, it is just a literal $ character. Strictly speaking, you could argue the behavior is unspecified, since the passage above does not say what a $ anywhere other than the end of the regexp does.
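A minimal sketch of the BRE behavior described above, using made-up sample input. It assumes a sed that follows the common reading of the spec (GNU sed does), where a $ in the middle of a BRE is literal and a $ at the very end is an anchor:

```shell
# Hypothetical sample input; single quotes keep the shell from expanding $.

# $ in the middle of a BRE: matched as a literal dollar sign.
# { and } are also literal in a BRE, and the . matches any character.
mid=$(printf '%s\n' 'version=${projects.version}' |
  sed -e 's/${projects.version}/20.0/g')

# $ as the last character of the BRE: an end-of-line anchor.
end=$(printf '%s\n' 'hello world' 'world peace' |
  sed -e 's/world$/test/')

printf '%s\n' "$mid" "$end"
# prints:
#   version=20.0
#   hello test
#   world peace
```

Only the "world" at the end of its line is replaced; the one at the start of "world peace" is left alone.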

For EREs the 2nd paragraph is a little different:

A <dollar-sign> ( '$' ) outside a bracket expression shall anchor the expression or subexpression it ends to the end of a string; such an expression or subexpression can match only a sequence ending at the last character of a string. For example, the EREs "ef$" and "(ef$)" match "ef" in the string "abcdef", but fail to match in the string "cdefab", and the ERE "e$f" is valid, but can never match because the 'f' prevents the expression "e$" from matching ending at the last character.

Note that last sentence: it means the $ is NOT treated literally in an ERE when it is not at the end of a regexp; such a regexp simply can't match anything.
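To see the ERE difference in practice, here is a small sketch. It assumes your sed accepts -E to select EREs (GNU and BSD sed both do); the input line is made up for illustration:

```shell
# Hypothetical input containing a literal dollar sign.

# Unescaped $ in the middle of an ERE is still an anchor,
# so e$f can never match and the line passes through unchanged.
unescaped=$(printf '%s\n' 'e$f' | sed -E 's/e$f/X/')

# Escaping the $ makes it an ordinary character again.
escaped=$(printf '%s\n' 'e$f' | sed -E 's/e\$f/X/')

printf '%s\n' "$unescaped" "$escaped"
# prints:
#   e$f
#   X
```

So the same unescaped pattern that "just works" in a BRE silently stops matching once you switch to -E.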

You should never have to worry about this, though. For clarity if nothing else, always escape any regexp metacharacter you want treated literally, so you shouldn't write:

's/$foo/bar/'

but write either of these instead:

's/\$foo/bar/'
's/[$]foo/bar/'

and then none of the semantics mentioned above matter.
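As a quick check, the sketch below runs both escaped forms in BRE mode and again with -E (assuming your sed supports that flag). The input string is hypothetical:

```shell
# Hypothetical input; single quotes stop the shell from expanding $foo.
input='cost: $foo'

# BRE (default) mode
bre_backslash=$(printf '%s\n' "$input" | sed -e 's/\$foo/bar/')
bre_bracket=$(printf '%s\n' "$input" | sed -e 's/[$]foo/bar/')

# ERE mode
ere_backslash=$(printf '%s\n' "$input" | sed -E 's/\$foo/bar/')
ere_bracket=$(printf '%s\n' "$input" | sed -E 's/[$]foo/bar/')

printf '%s\n' "$bre_backslash" "$bre_bracket" "$ere_backslash" "$ere_bracket"
# prints "cost: bar" four times
```

Keep the sed script in single quotes: inside double quotes the shell would try to expand $foo before sed ever sees it.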
