How to escape regex in sed replace

Time:11-05

I want to replace text in a file. My regex is [\s\S\n]*<h1 class='test'>. I tried the following commands, but the text was not replaced.

  1. sed -i.bak 's/[\s\S\n]*<h1 class='test'>//g' 36
  2. sed -i.bak 's/\[\\s\\S\\n\]\*<h1 class=\x27test\x27>//g' 36

The file name is 36.

Running grep "[\s\S\n]*<h1 class='test'>" -q 36 && echo "FOUND" || echo "NOTFOUND" prints FOUND.

CodePudding user response:

sed by default only operates on a line-by-line basis.

To match across lines, and since you appear to be using GNU sed, add the -z option: it makes sed separate records on NUL characters instead of newlines, so a file with no NUL bytes is read as a single record and the pattern can "see" line breaks. Then use . to match any character (in sed's POSIX regex, . also matches a line break inside the pattern space). Note that [\s\S] does not mean "any character" in POSIX: inside a POSIX bracket expression, PCRE-style shorthand character classes are parsed as a literal backslash plus the character next to it (i.e. [\s] matches a \ or an s).
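To see the bracket-expression behaviour for yourself, here is a small sketch. The sample string and the variable name out are made up for illustration, and it assumes a sed that follows the POSIX rule described above (GNU sed does):

```shell
# Inside [...] the \s shorthand is not recognised:
# [\s] is a set containing a literal backslash and the letter s.
out=$(printf '%s\n' 'a\bsc d' | sed 's/[\s]/X/g')

# The backslash and the "s" are replaced; the space is left alone.
printf '%s\n' "$out"   # aXbXc d
```

If \s worked as a whitespace class here, the space would have been replaced instead.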

Another issue is that you used single quotes inside a single-quoted string. The shell cannot nest them: each inner ' just ends or restarts the quoted string, so the quotes are removed before sed runs and your pattern has no ' in it.
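A quick way to check what the shell actually hands to sed is to print the argument. The variable names broken and fixed below are only for illustration:

```shell
# Single quotes inside single quotes: the inner ' characters end and
# restart quoting, and the adjacent pieces are glued together.
broken='s/<h1 class='test'>//g'

# Double quotes around the script keep the inner ' characters literal.
fixed="s/<h1 class='test'>//g"

printf '%s\n' "$broken"   # s/<h1 class=test>//g
printf '%s\n' "$fixed"    # s/<h1 class='test'>//g
```

The first script never contains a ' so it cannot match class='test' in the file.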

So, with GNU sed use

sed -i.bak -z "s/.*<h1 class='test'>//g" 36
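As a rough check, the same substitution can be tried on a throwaway file without -i, so nothing is edited in place. The file contents and the names tmp and result are hypothetical, and the sketch assumes GNU sed, since -z is a GNU extension:

```shell
# Build a small sample file with text before the <h1> tag.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
junk line 1
junk line 2
<h1 class='test'>Title</h1>
body
EOF

# -z: the file has no NUL bytes, so it is processed as one record
# and .* can run across the line breaks up to the tag.
result=$(sed -z "s/.*<h1 class='test'>//" "$tmp")
printf '%s\n' "$result"

rm -f "$tmp"
```

Everything up to and including the opening tag is removed and the rest of the file is printed. Keep in mind that .* is greedy, so if the tag occurs more than once, the match extends to the last occurrence.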

With a non-GNU sed, you could use techniques described here.
