I want to use a regex to capture two substrings into two separate capture groups. The delimiter between Country and City is " ■" (a space followed by a black square). If the delimiter doesn't exist, then it means there is no city, in which case it should capture a blank value. Here is the text:
<p>USA</p>
<p>SPAIN ■Madrid</p>
<p>FRANCE</p>
I have the following regex, which captures everything between the <p> tags:
/<p>(.+)<\/p>/
How can I capture Country and City separately (or blank city if no delimiter)?
CodePudding user response:
Try:
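/<p>(.+?)(?:\s■(.+))?<\/p>/
(.+?) is the first capture group (the country). The ? after the + makes it lazy, so it matches as few characters as possible and stops before the " ■" delimiter instead of consuming it. With a greedy .+ the first group would swallow the delimiter and the city as well.
(?:\s■(.+))? is a non-capturing group, due to the ?:, and it contains the second capture group (.+) (the city).
The ? after the non-capturing group makes the whole construct optional, in case there is no city. In that case the second group does not take part in the match, and depending on the regex engine it is reported as an empty or unset value.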
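To check the pattern against the three sample lines from the question, here is a minimal sketch. The question does not say which language is used, so Python is an assumption here, and the names PATTERN and split_country_city are made up for illustration. In Python an optional group that did not match comes back as None, so the sketch maps it to an empty string to get the blank city the question asks for.

```python
import re

# Pattern from the answer: lazy country, then an optional " ■city" part.
# The slash needs no escaping here because Python has no /.../ delimiters.
PATTERN = re.compile(r"<p>(.+?)(?:\s■(.+))?</p>")


def split_country_city(line):
    """Return (country, city), with city == "" when the delimiter is absent.

    Returns None if the line contains no <p>...</p> element at all.
    """
    match = PATTERN.search(line)
    if match is None:
        return None
    country, city = match.group(1), match.group(2)
    # An optional group that did not participate is None in Python.
    return country, city or ""


samples = ["<p>USA</p>", "<p>SPAIN ■Madrid</p>", "<p>FRANCE</p>"]
results = [split_country_city(s) for s in samples]
for result in results:
    print(result)
# ('USA', '')
# ('SPAIN', 'Madrid')
# ('FRANCE', '')
```

Other engines report the missing group differently (for example as undefined rather than None), so the "or blank" step may need adapting to the language actually in use.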