I need to process stdin in the following way:
(1) every newline character must be replaced with \n (a literal \ followed by n)
(2) nothing else should be changed
I chose awk to do it, and I would like an answer that uses awk if possible.
I came up with:
echo -ne "A\nB\nC" | awk '{a[NR]=$0;} END{for(i=1;i<NR;i++){printf "%s\\n",a[i];};printf "%s",a[NR];}'
But it looks cumbersome.
Is there a better / cleaner way?
CodePudding user response:
With awk:
echo -ne "A\nB\nC" | awk 'BEGIN{FS="\n"; OFS="\\n"; RS=ORS=""} {$1=$1}1'
Output:
A\nB\nC
See: 8 Powerful Awk Built-in Variables – FS, OFS, RS, ORS, NR, NF, FILENAME, FNR
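One caveat with the command above: an empty RS puts awk into paragraph mode, where runs of blank lines act as record separators, so empty lines in the input are not preserved. The following is a minimal sketch of a portable alternative, not the answerer's method; the function name is hypothetical. It prints a literal \n before every record except the first, and it assumes that a trailing newline at the end of the input may be dropped, since portable awk cannot tell whether one was present.

```shell
# Hypothetical helper name; portable awk, no GNU extensions assumed.
join_with_literal_backslash_n() {
  awk '
    NR > 1 { printf "%s", "\\n" }  # before every record but the first, emit a literal \n
    { printf "%s", $0 }            # emit the record itself without a newline
  '
}

# The empty line between B and C is kept; echo only adds a final newline for display.
printf 'A\nB\n\nC' | join_with_literal_backslash_n; echo
# prints: A\nB\n\nC
```

Because each record is printed as soon as it is read, this does not hold the whole input in memory the way the array-based version in the question does.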
CodePudding user response:
Handling malformed files (i.e. files that don't end with the record separator) with awk is tricky. sed -z is GNU-specific and has the side effect of slurping the whole (text) file into RAM, which might be an issue for huge files.
Thus, for a robust and reasonably portable solution I would use perl:
perl -pe 's/\n/\\n/'
CodePudding user response:
Using GNU awk for multi-char RS:
$ echo -ne "A\nB\n\nC" | awk -v RS='^$' -v ORS= -F'\n' -v OFS='\\n' '{$1=$1} 1'
A\nB\n\nC$
You need to use GNU awk for this, as no other awk will tell you whether the input ended with \n or not.
CodePudding user response:
I would use GNU AWK for this task in the following way:
echo -ne "A\nB\nC" | awk '{printf "%s%s",$0,RT?"\\n":""}'
gives output
A\nB\nC
(without trailing newline)
Explanation: for each line I build the output string from the current line's content ($0) followed by either a backslash and n or an empty string, depending on RT, which is the record terminator for the current line. RT is a newline for every line that ends with one and an empty string for a final line that lacks a trailing newline, so in a boolean context it is true for all but that last line. I used the so-called ternary operator here: condition ? valueiftrue : valueiffalse.
(tested in GNU Awk 5.0.1)