If I want to replace several lines, for example in a file or on STDIN, and I don't know their line numbers, I can turn the whole stream into one line, for example with tr, like this:
$ printf "%s\n" aaa bbb ccc ddd | tr '\n' '\0' | sed -e 's#bbb\x0ccc\x0ddd#string2\x0string3\x0string4#g' | tr '\0' '\n'
aaa
bbb
ccc
ddd
The input comes out unchanged. I want to get this output instead:
aaa
string2
string3
string4
Note that this is a test example; in the real case I do not know the numbers of the lines in which to make the substitution. I only know the lines that need to be replaced and the lines to replace them with.
As far as I can see, sed can replace NUL characters, for example:
printf "%s\n" aaa bbb ccc ddd | tr '\n' '\0' | sed -e 's#\x0#\n#g'
aaa
bbb
ccc
ddd
Why doesn't the substitution happen in the first case?
I could use a regular expression group, \(.*\), instead of \x0, but with different input data it makes the wrong substitution, as in the example below:
$ printf "%s\n" aaa bbb ccc ddd bbb ddd | tr '\n' '\0' | sed -e 's#bbb\(.*\)ccc\(.*\)ddd#string2\1string3\2string4#g' | tr '\0' '\n'
aaa
string2
string3
ddd
bbb
string4
Can you please tell me how to correctly replace multiple lines? Thank you for your help!
CodePudding user response:
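The problem seems to be that the \x escape consumes more than just the one zero. Consider that in \x0c, both 0 and c are valid hexadecimal digits, so sed reads \x0c as the single byte 0x0C (form feed) rather than as a NUL followed by a literal c. The same happens with \x0d in \x0ddd.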
Hex escapes work differently depending on the language. In C, for example, they are greedy and consume every valid hex digit that follows. A saner \x escape for non-wide strings would consume at most two digits (enough to fill an 8-bit byte). Sed's version seems to work like that.
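A quick way to check this reading is to feed sed a lone form feed and see whether the pattern \x0c matches it. This is only a sketch and assumes GNU sed, since \xHH escapes are a GNU extension; the variable name ff_check is made up for the illustration.

```shell
# Assumption: GNU sed (the \xHH escape is not part of POSIX sed).
# The input is a single form feed byte (0x0c). If sed parsed \x0c as
# "NUL followed by the letter c", the pattern could not match it.
ff_check=$(printf '\f' | sed -e 's#\x0c#MATCHED#')
printf '%s\n' "$ff_check"
```

If this prints MATCHED, the pattern \x0c matched the form feed, which is why bbb\x0ccc never matches bbb, NUL, ccc in the original command.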
Experimentally, replacing \x0 with \x00 works:
printf "%s\n" aaa bbb ccc ddd | tr '\n' '\0' | sed -e 's#bbb\x00ccc\x00ddd#string2\x00string3\x00string4#g' | tr '\0' '\n'
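To see that the two-digit form also avoids the bad match from the question's second input (aaa bbb ccc ddd bbb ddd), the same pipeline can be run on that data. Again a sketch that assumes GNU sed; the variable name result is only there so the output can be inspected.

```shell
# Assumption: GNU sed, which accepts \x00 in both the pattern and the replacement.
# Join the lines with NUL, replace the exact three-line run, then split again.
result=$(printf '%s\n' aaa bbb ccc ddd bbb ddd \
  | tr '\n' '\0' \
  | sed -e 's#bbb\x00ccc\x00ddd#string2\x00string3\x00string4#g' \
  | tr '\0' '\n')
printf '%s\n' "$result"
```

Only the consecutive bbb, ccc, ddd run should be rewritten; the trailing bbb and ddd lines are left alone because the pattern now requires the NUL separators between the exact lines.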