I want to use regex to capture color, animal, and country from the following HTML. However, with country, there is a possibility that a <br> tag exists before the country name, such as with SPAIN in my example. I want to omit that <br> tag, so that only "SPAIN" is captured.
<p><span >RED</span><br><span >DOG</span>USA</p>
<p><span >GREEN</span><br><span >CAT</span><br>SPAIN</p>
<p><span >BLUE</span><br><span >MOUSE</span>FRANCE</p>
I have the following regex, but it doesn't omit the <br> tag before the country:
/<p><span >(.*)<\/span><br><span >(.*)<\/span>(.*)<\/p>/
Please help.
CodePudding user response:
Try this:
<p><span >(.*)<\/span><br><span >(.*)<\/span>(?:<br>)?(.*)<\/p>
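(?:...) is a non-capturing group.
? matches the preceding group 0 or 1 times, so the <br> is consumed when present and never ends up in the country capture.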
check pattern: Regex101
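For anyone who wants to try the pattern outside Regex101, here is a minimal sketch in Python (an assumption; the question does not say which language is used). The names PATTERN and HTML are made up for illustration, and the sample input is copied from the question.

```python
import re

# Same pattern as above. The optional non-capturing group (?:<br>)?
# consumes a <br> before the country when one is present.
PATTERN = re.compile(
    r"<p><span >(.*)</span><br><span >(.*)</span>(?:<br>)?(.*)</p>"
)

# Sample input from the question, one <p> element per line.
HTML = """<p><span >RED</span><br><span >DOG</span>USA</p>
<p><span >GREEN</span><br><span >CAT</span><br>SPAIN</p>
<p><span >BLUE</span><br><span >MOUSE</span>FRANCE</p>"""

# findall returns one (color, animal, country) tuple per matching line.
rows = PATTERN.findall(HTML)

for color, animal, country in rows:
    print(color, animal, country)
```

This relies on each <p> element sitting on its own line, because . does not match a newline by default. If several <p> elements could share a line, the greedy (.*) groups would likely need to become lazy (.*?).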
CodePudding user response:
You can try this to match only the uppercase content between > and <:
(?<=>)([[:upper:]]+)(?=<)
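A rough sketch of the same lookaround idea in Python, assuming the fields are always plain uppercase ASCII words. Python's re module does not support the POSIX class [[:upper:]], so [A-Z] is swapped in here. The helper name extract_fields is hypothetical.

```python
import re

# Lookbehind (?<=>) requires a ">" just before the match and lookahead
# (?=<) requires a "<" just after it; neither is part of the match.
# [A-Z]+ stands in for [[:upper:]]+, which Python's re does not support.
UPPER_BETWEEN_TAGS = re.compile(r"(?<=>)([A-Z]+)(?=<)")


def extract_fields(line: str) -> list[str]:
    """Return every uppercase word that sits directly between > and <."""
    return UPPER_BETWEEN_TAGS.findall(line)


line = "<p><span >GREEN</span><br><span >CAT</span><br>SPAIN</p>"
fields = extract_fields(line)
print(fields)
```

Since the <br> tag is never uppercase text between > and <, it is skipped without any special handling. The trade-off is that this approach returns a flat list of words per line instead of named capture groups.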