I'm trying to trim whitespace at the start and end of a file, without touching the whitespace in between, using perl inside a bash script.
The file has the following format
<newline>
<space><newline>
<tab><newline>
<space><tab><newline>
START<newline><newline>
<space>INDENTED<newline><newline>
END<newline>
<space><tab><newline>
<tab><newline>
<space><newline>
<newline>
NOTE: <newline> is \n, <space> is a literal space character & <tab> is \t
So the original file looks like
START
INDENTED
END
I need the content of the file to be
START<newline><newline>
<space>INDENTED<newline><newline>
END
i.e. the final file should look like this
START
INDENTED
END
I tried both of the following commands, but they trim intermediate whitespace as well. Both trim spaces & newlines from the whole document rather than just from the start of the document
perl -pi -e 's/^\s*//gs' sample.txt
perl -pi -e 's/\A\s*//gs' sample.txt
Both collapsed all internal space
START<newline>
INDENTED<newline>
END<newline>
I then tried the following to trim the end of the document
perl -pi -e 's/\s*$//gs' sample.txt
perl -pi -e 's/\s*\Z//gs' sample.txt
Both collapsed newlines
START<space>INDENTEDEND<newline>
Here are my assumptions
\A matches just the start of the document & \Z matches the end of the document (as opposed to ^ & $)
The s in the gs flag ensures the whole document is treated as a single line, with newlines replaced with the character \n
I am new to perl. I'd appreciate it if someone could help me understand where I went wrong
CodePudding user response:
You may use this perl command in slurp mode:
perl -0777 -pe 's/^\s+|(\R)\s+$/$1/g' file
Output:
START
INDENTED
END
Details:
- -0777: Enables slurp mode to make perl read the full file as a single string
- ^\s+: Match 1+ whitespace characters at the start of the file
- (\R)\s+$: Match a line break followed by 1+ whitespace characters at the end of the file
- We use $1 in the replacement to put the line break back; otherwise you will get the file content without an ending line break
CodePudding user response:
Not perl, but ed is useful for editing files:
$ printf '%s\n' '1,/START/-1d' '/END/+1,$d' w | ed -s sample.txt
$ cat sample.txt
START
INDENTED
END
This deletes everything in the ranges of lines from the first to the line before the one matching START, and from the line after END to the end of the file, and then writes the changed file back to disk.
Or a similar perl approach, which only prints lines in the range you want to keep:
perl -i -ne 'print if /START/../END/' sample.txt
CodePudding user response:
Here is a short sed version:
sed -n '/START/,/END/p'
or with the negated logic:
sed '1,/START/{/START/!d}; /END/,${/END/!d}'