I have the following Regex in my PHP code:
// markers for italic set *Text*
if (substr_count($line, '*') >= 2)
{
    $line = preg_replace('#\*{1}(.*?)\*{1}#', '<i>$1</i>', $line);
}
which works great.
However, when a $line contains a <br>, e.g.
*This is my text<br>* Some other text
the regex still matches the text and transforms it to:
<i>This is my text<br></i> Some other text
The goal is to leave the text untransformed if a <br> is encountered. How can I do that with a regex, e.g. using a so-called "negative lookahead", or how can the existing regex be changed?
Note: Strings like *This is my text*<br>Some other text<br>And again *italic*<br>END
should still be considered and transformed.
Idea: Or should I explode the $line and then iterate over the results with the regex?
CodePudding user response:
Using the match-what-you-don't-want-and-discard technique, you can use this regex in PHP (PCRE):
\*[^*]*<br>\*(*SKIP)(*F)|\*([^*]*)\*
and replace with <i>$1</i>
PHP code:
$r = preg_replace('/\*[^*]*<br>\*(*SKIP)(*F)|\*([^*]*)\*/', '<i>$1</i>', $input);
Explanation:
\* : Match a *
[^*]* : Match 0 or more non-* characters
<br> : Match <br>
\* : Match closing *
(*SKIP)(*F) : PCRE verbs to discard and skip this match
| : OR
\*([^*]*)\* : Match string enclosed by *s
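To try the pattern above on the two strings from the question, here is a minimal sketch. The function name italicizeSkippingBr is hypothetical and only there for illustration; the pattern and replacement are the ones given in this answer.

```php
<?php
// Hypothetical helper: wraps the (*SKIP)(*F) pattern from the answer.
function italicizeSkippingBr(string $line): string
{
    // Alternative 1 consumes "*...<br>*" and then forces a failure, so that
    // span is skipped. Alternative 2 captures ordinary *text* in group 1.
    $pattern = '/\*[^*]*<br>\*(*SKIP)(*F)|\*([^*]*)\*/';
    return preg_replace($pattern, '<i>$1</i>', $line);
}

echo italicizeSkippingBr('*This is my text<br>* Some other text'), PHP_EOL;
// prints: *This is my text<br>* Some other text

echo italicizeSkippingBr('*This is my text*<br>Some other text<br>And again *italic*<br>END'), PHP_EOL;
// prints: <i>This is my text</i><br>Some other text<br>And again <i>italic</i><br>END
```

Note that the first alternative only fires when the <br> sits directly before the closing *, which is the case shown in the question.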
CodePudding user response:
You can replace matches of the regular expression
\*((?:(?!<br>)[^*])+)\*
with
'<i>$1</i>'
where $1 holds the text captured between the asterisks.
The regular expression can be broken down as follows.
\*            # match '*'
(             # begin capture group 1
  (?:         # begin a non-capture group
    (?!<br>)  # negative lookahead asserts that the next four chars are not '<br>'
    [^*]      # match any char other than '*'
  )+          # end non-capture group and execute it one or more times
)             # end capture group 1
\*            # match '*'
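A minimal sketch of this lookahead approach in PHP follows. It assumes the repeated group is wrapped in a capture group so that only the text between the asterisks is copied into the <i> tag; the function name italicizeTempered is hypothetical.

```php
<?php
// Hypothetical helper: every character between the asterisks must be a
// non-'*' character that does not start a "<br>" (negative lookahead).
function italicizeTempered(string $line): string
{
    // Group 1 holds the text between the asterisks (assumption: we want
    // the asterisks dropped from the output, as in the question).
    $pattern = '/\*((?:(?!<br>)[^*])+)\*/';
    return preg_replace($pattern, '<i>$1</i>', $line);
}

echo italicizeTempered('*This is my text<br>* Some other text'), PHP_EOL;
// prints: *This is my text<br>* Some other text

echo italicizeTempered('*This is my text*<br>Some other text<br>And again *italic*<br>END'), PHP_EOL;
// prints: <i>This is my text</i><br>Some other text<br>And again <i>italic</i><br>END
```

Unlike the (*SKIP)(*F) variant, a failed attempt here does not consume the closing *, so in a line with further asterisks after the <br> span that * can still open a later match.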