I want to get the text between <H2> tags. The code I wrote was:
grep -oP "<H2>.*?</H2>" redirect.html
The output I get is:
<H2><A NAME="s3">3. All about redirection</A> </H2>
I want to remove the <H2> and </H2> from the output by changing the regular expression.
CodePudding user response:
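Try this, which uses a lookbehind and a lookahead so the tags must be present but are not part of the printed match: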
grep -oP "(?<=<H2>).*?(?= </H2>)" redirect.html
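To check this against the line from the question, here is a minimal sketch. It assumes GNU grep built with PCRE support (needed for -P); the scratch file from mktemp stands in for redirect.html, and the variable names are only illustrative.

```shell
# Put the sample line from the question into a scratch file
# (a stand-in for redirect.html).
tmp=$(mktemp)
printf '%s\n' '<H2><A NAME="s3">3. All about redirection</A> </H2>' > "$tmp"

# (?<=<H2>)   lookbehind: the match must be preceded by <H2>
# .*?         lazy match of the content
# (?= </H2>)  lookahead: a space and </H2> must follow, but are not consumed
result=$(grep -oP '(?<=<H2>).*?(?= </H2>)' "$tmp")
printf '%s\n' "$result"
# prints: <A NAME="s3">3. All about redirection</A>

rm -f "$tmp"
```

Note that the lookahead requires exactly one space directly before </H2>, as in the sample line; a heading written without that space would not be matched by this pattern.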
CodePudding user response:
An alternative with a single lookahead: use \K to reset the start of the reported match (forgetting what has been matched so far), and assert one or more horizontal whitespace characters followed by </H2> to the right of the current position:
grep -oP "<H2>\K.*?(?=\h+</H2>)" redirect.html
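The \K approach can be tried out the same way. This is a sketch under assumptions: GNU grep with PCRE support, a scratch file in place of redirect.html, and a second heading line ("4. Pipes") that is made up here to show the difference between \h+ (one or more horizontal whitespace characters, as described above) and \h* (zero or more). The \h* variant is an addition for illustration, not part of the answer.

```shell
# Scratch file: the line from the question plus a made-up heading
# with no whitespace before the closing tag.
tmp=$(mktemp)
printf '%s\n' \
  '<H2><A NAME="s3">3. All about redirection</A> </H2>' \
  '<H2>4. Pipes</H2>' > "$tmp"

# \h+ : at least one horizontal whitespace character must precede </H2>,
# so only the first line matches.
strict=$(grep -oP '<H2>\K.*?(?=\h+</H2>)' "$tmp")
printf '%s\n' "$strict"
# prints: <A NAME="s3">3. All about redirection</A>

# \h* : whitespace before </H2> is optional, so both lines match.
tolerant=$(grep -oP '<H2>\K.*?(?=\h*</H2>)' "$tmp")
printf '%s\n' "$tolerant"
# prints:
# <A NAME="s3">3. All about redirection</A>
# 4. Pipes

rm -f "$tmp"
```

Single quotes are used around the pattern so the shell passes the backslashes through untouched; the double-quoted form shown in the thread also works in bash for these patterns.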