I want to replace hn tags that do not contain a class attribute. The idea is to match anything that follows hn and ends with >, unless it contains the string class.
This is my first attempt:
<?php
$content = <<<HTML
<h1 style="color:black">test1</h1>
<H2 >test2</H2>
<h5 >test</h5>
<h5 >test test</h5>
HTML;
$content = preg_replace('#<h([1-6])((?!class).)*?>(.*?)<\/h[1-6]>#si', '<p ${2}>${3}</p>', $content);
echo ($content);
The result is:
<p ">test1</p>
<H2 >test2</H2>
<h5 >test</h5>
<h5 >test test</h5>
It should be:
<p style="color:black">test1</p>
<H2 >test2</H2>
<h5 >test</h5>
<h5 >test test</h5>
Any idea why $2 maps to " instead of style="color:black"?
CodePudding user response:
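The capturing group needs to go in a slightly different place. When a quantifier is applied to a capturing group, the group keeps only its last repetition, which is why $2 ends up holding just the final ".
Replace ((?!class).)*? with ((?:(?!class).)*?) so that the quantifier sits inside the capture.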
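To make the difference concrete, here is a minimal sketch that runs both forms of the group against the first opening tag from the question. The variable names ($tag, $outside, $inside) are only illustrative, and the patterns are trimmed to the opening tag.

```php
<?php
$tag = '<h1 style="color:black">';

// Quantifier outside the capturing group: the group is overwritten on each
// repetition, so only the last matched character survives.
preg_match('#<h[1-6]((?!class).)*?>#si', $tag, $outside);

// Quantifier inside the capturing group: the group spans every repetition.
preg_match('#<h[1-6]((?:(?!class).)*?)>#si', $tag, $inside);

echo $outside[1], PHP_EOL;      // "
echo trim($inside[1]), PHP_EOL; // style="color:black"
```

The first echo reproduces the stray " from the question; the second shows the full attribute string once the quantifier is moved inside the group.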
Use
'#<h([1-6])\s*((?:(?!class).)*?)>(.*?)</h[1-6]>#si'
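As a quick check, the pattern can be dropped into the original script. This is only a sketch: the class values on the second and third headings are placeholders assumed for the example, since the sample markup shown in the question does not include them.

```php
<?php
// The class values below are placeholders chosen for this sketch.
$content = <<<HTML
<h1 style="color:black">test1</h1>
<H2 class="title">test2</H2>
<h5 class="note">test</h5>
HTML;

// Group 2 now wraps the whole tempered token, so it holds every attribute
// character up to the closing ">" rather than only the last one.
$pattern = '#<h([1-6])\s*((?:(?!class).)*?)>(.*?)</h[1-6]>#si';
$result = preg_replace($pattern, '<p ${2}>${3}</p>', $content);

echo $result, PHP_EOL;
// <p style="color:black">test1</p>
// <H2 class="title">test2</H2>
// <h5 class="note">test</h5>
```

Only the heading without a class attribute is rewritten; the other two are left untouched because the lookahead stops the match as soon as it reaches class.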
EXPLANATION
--------------------------------------------------------------------------------
  <h                       '<h' (also '<H', because of the i flag)
--------------------------------------------------------------------------------
  (                        group and capture to \1:
--------------------------------------------------------------------------------
    [1-6]                    any character of: '1' to '6'
--------------------------------------------------------------------------------
  )                        end of \1
--------------------------------------------------------------------------------
  \s*                      whitespace (\n, \r, \t, \f, and " ") (0 or
                           more times (matching the most amount
                           possible))
--------------------------------------------------------------------------------
  (                        group and capture to \2:
--------------------------------------------------------------------------------
    (?:                      group, but do not capture (0 or more
                             times (matching the least amount
                             possible)):
--------------------------------------------------------------------------------
      (?!                      look ahead to see if there is not:
--------------------------------------------------------------------------------
        class                    'class'
--------------------------------------------------------------------------------
      )                        end of look-ahead
--------------------------------------------------------------------------------
      .                        any character (including \n, because
                               of the s flag)
--------------------------------------------------------------------------------
    )*?                      end of grouping
--------------------------------------------------------------------------------
  )                        end of \2
--------------------------------------------------------------------------------
  >                        '>'
--------------------------------------------------------------------------------
  (                        group and capture to \3:
--------------------------------------------------------------------------------
    .*?                      any character (including \n, because of
                             the s flag) (0 or more times (matching
                             the least amount possible))
--------------------------------------------------------------------------------
  )                        end of \3
--------------------------------------------------------------------------------
  </h                      '</h'
--------------------------------------------------------------------------------
  [1-6]                    any character of: '1' to '6'
--------------------------------------------------------------------------------
  >                        '>'
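
The same breakdown can be kept next to the pattern itself by using the x (extended) flag, which ignores whitespace inside the pattern and treats # as the start of a comment. The sketch below is one possible way to write it; the ~ delimiter is an assumption made here so that # is free for comments, and the sample strings in the usage lines are illustrative.

```php
<?php
$pattern = '~
    <h([1-6])            # "<h" plus the heading level, captured as group 1
    \s*                  # optional whitespace after the tag name
    (                    # group 2: the whole attribute string
      (?:(?!class).)*?   #   one character at a time, never stepping onto "class"
    )
    >                    # end of the opening tag
    (.*?)                # group 3: the heading content
    </h[1-6]>            # closing heading tag
~six';

// A heading without class is rewritten; one with class is left unchanged.
$plain   = preg_replace($pattern, '<p $2>$3</p>', '<h1 style="color:black">test1</h1>');
$classed = preg_replace($pattern, '<p $2>$3</p>', '<h5 class="note">test</h5>');

echo $plain, PHP_EOL;   // <p style="color:black">test1</p>
echo $classed, PHP_EOL; // <h5 class="note">test</h5>
```

Note that the closing tag is matched with [1-6] rather than a backreference, so the pattern does not check that the closing level equals the opening one.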