How to use SED to remove texts starting from the first pattern until the second occurrence of the en

Time:02-28

I am trying to use sed to remove text, starting from the line that matches the first pattern and ending at the line with the second occurrence of the end pattern.

For example, the first pattern is BEGIN, and the end pattern is BREAKPOINT.

Input Example:

qq
ww
BEGIN
ab
ef
BREAKPOINT
ij
mn
BREAKPOINT
kk
BREAKPOINT
zz
yy
xx
BEGIN
ab
ef
BREAKPOINT
ij
mn
BREAKPOINT
pp

Expected Output:

qq
ww
kk
BREAKPOINT
zz
yy
xx
pp

Could you tell me how to do this, or suggest an alternative way to achieve it?

PS. I only know how to print the text from the first pattern through the second occurrence of the end pattern, not how to remove it.

sed -n '/BEGIN/{:a;N;s/BREAKPOINT/&/2p;Ta}' file

Thanks.

CodePudding user response:

It is almost certainly possible to do this in sed, but the result would be hard to understand a few weeks from now unless you are extremely familiar with sed. It is better to switch to a more human-readable scripting language such as awk:

awk '/^BEGIN$/ && !s { s=2 }
!s
/^BREAKPOINT$/ && s { s-- }' file

In brief, the variable s tracks whether we have seen the beginning separator and, if so, how many more times we need to see the terminating separator before printing resumes. Because the bare pattern !s is tested before the counter is decremented, the closing BREAKPOINT line itself is not printed.
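The same counter logic can be spelled out as a small parameterized shell function. This is only a sketch: the function name remove_blocks and the awk variable names start_pat, end_pat, count and remaining are made up for illustration, and it compares whole lines exactly instead of using regular expressions.

```shell
#!/bin/sh
# Hypothetical helper (name and arguments are illustrative assumptions):
# reads stdin and drops everything from a start line through the
# count-th following end line, inclusive.
remove_blocks() {
    awk -v start_pat="$1" -v end_pat="$2" -v count="$3" '
        # Begin skipping at a start line, unless we are already skipping.
        $0 == start_pat && !remaining { remaining = count + 0 }
        # A bare pattern prints the line; true only while not skipping.
        !remaining
        # Count down on each end line seen while skipping.
        $0 == end_pat && remaining { remaining-- }
    '
}

# Usage on a shortened version of the input from the question.
printf '%s\n' qq ww BEGIN ab ef BREAKPOINT ij mn BREAKPOINT kk BREAKPOINT zz |
    remove_blocks BEGIN BREAKPOINT 2
```

On that shortened input the call should print qq, ww, kk, BREAKPOINT and zz, one per line. Passing the patterns and the count as arguments makes it easy to try a different number of end markers without editing the awk program.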

CodePudding user response:

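A Perl solution, which slurps the whole file with -0777 and uses the /m flag so that ^ matches at the start of each line: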

perl -0777 -pe 's/^BEGIN[\s\S]*?^BREAKPOINT[\s\S]*?^BREAKPOINT\R//gm' file

Or, more compactly:

perl -0777 -pe 's/^BEGIN(?:[\s\S]*?^BREAKPOINT\R){2}//gm' file