I have a sentence with BBCodes and I would like to replace them with HTML tags:
$sentence = '[html style="font-size: 18px;" dir="ltr"][div style="font-size: 18px;" dir="ltr"][p style="font-weight: bold;"]Hello,[/p][p]You have got a new message from [a href="https://www.example.com/"]Example.com[/a][br][br].You could check your message on [a href="https://www.example.com/en/manager/inbox.html"]Manager[/a][/p][p][img src="https://www.example.com/assets/images/logo-default-120x50.png" width="120px" height="80px"][div style="color: #D4192D; font-weight: bold;"]Example.com Team[/div][/p][/div][/html]';
$htmlTags = '<$1>$2</$3>';
$bbTags = '/\[(.*)\](.*)\[\/(.*)\]/';
$new = preg_replace($bbTags, $htmlTags, $sentence);
echo $new;
The output is:
<html style="font-size: 18px;" dir="ltr"][div style="font-size: 18px;" dir="ltr"][p style="font-weight: bold;"]Hello,[/p][p]You have got a new message from [a href="https://www.example.com/"]Example.com[/a][br][br].You could check your message on [a href="https://www.example.com/en/manager/inbox.html"]Manager[/a][/p][p][img src="https://www.example.com/assets/images/logo-default-120x50.png" width="120px" height="80px"][div style="color: #D4192D; font-weight: bold;"]Example.com Team[/div][/p][/div></html>
So only the outermost tag is converted, not the whole sentence.
I do not want to use an array of codes with their replacements.
PS: The sentence can change from case to case.
CodePudding user response:
You can use the following PHP code:
<?php
$sentence = '[html style="font-size: 18px;" dir="ltr"][div style="font-size: 18px;" dir="ltr"][p style="font-weight: bold;"]Hello,[/p][p]You have got a new message from [a href="https://www.example.com/"]Example.com[/a][br][br].You could check your message on [a href="https://www.example.com/en/manager/inbox.html"]Manager[/a][/p][p][img src="https://www.example.com/assets/images/logo-default-120x50.png" width="120px" height="80px"][div style="color: #D4192D; font-weight: bold;"]Example.com Team[/div][/p][/div][/html]';
$rx = '~\[((\w+)\b[^]]*)\]((?>(?!\[\2\b).|(?R))*)\[\/\2]~s';
$tmp = '';
while (preg_match($rx, $sentence) && $tmp != $sentence) {
    $tmp = $sentence;
    $sentence = preg_replace($rx, '<$1>$3</$2>', $sentence);
}
$sentence = preg_replace('~\[([^]]*)]~', '<$1 />', $sentence);
echo $sentence;
Output:
<html style="font-size: 18px;" dir="ltr">
<div style="font-size: 18px;" dir="ltr">
<p style="font-weight: bold;">Hello,</p>
<p>You have got a new message from <a href="https://www.example.com/">Example.com</a><br /><br />.You could check your message on <a href="https://www.example.com/en/manager/inbox.html">Manager</a></p>
<p><img src="https://www.example.com/assets/images/logo-default-120x50.png" width="120px" height="80px" />
<div style="color: #D4192D; font-weight: bold;">Example.com Team</div>
</p>
</div>
</html>
See the regex demo #1 and regex demo #2.
Details:
\[ - a [ char
((\w+)\b[^]]*) - Group 1 ($1): one or more word chars (captured into Group 2), then a word boundary and zero or more chars other than ]
\] - a ] char
((?>(?!\[\2\b).|(?R))*) - Group 3 ($3): any char that is not the starting point of a [ + Group 2 (as a whole word) char sequence, or the whole pattern recursed
\[\/\2] - the [/ string, the Group 2 value, and a ] char.
This is the pattern that handles paired tags. The second pattern handles non-paired tags:
\[ - a [ char
([^]]*) - Group 1 ($1): zero or more chars other than ]
] - a ] char.
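As a minimal sketch, the two steps explained above can be wrapped in a single function and tried on a shorter string. The function name bbcodeToHtml and the sample input are assumptions made for illustration; they are not part of the original code:

```php
<?php
// Hypothetical helper wrapping the two patterns explained above.
function bbcodeToHtml(string $text): string
{
    // Paired tags: [name attrs]...[/name]
    $paired = '~\[((\w+)\b[^]]*)\]((?>(?!\[\2\b).|(?R))*)\[\/\2]~s';
    $previous = null;

    // Each pass converts the outermost remaining pairs; repeat until
    // nothing matches any more or the string stops changing.
    while ($previous !== $text && preg_match($paired, $text)) {
        $previous = $text;
        $text = preg_replace($paired, '<$1>$3</$2>', $text);
    }

    // Any bracketed tag that is still left has no closing partner.
    return preg_replace('~\[([^]]*)]~', '<$1 />', $text);
}

echo bbcodeToHtml('[b]Hi [i]there[/i][/b][br]');
// prints: <b>Hi <i>there</i></b><br />
```

In this example the first pass converts the outer [b] pair, the second pass converts the nested [i] pair, and the final replacement turns the remaining [br] into a self-closing tag.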
CodePudding user response:
It's not possible to do this in a single pass, because you have to deal with nested tags and a pattern can't match the same substring several times.
One solution is to start the replacement with the innermost tags (tags that contain no other bracketed tags). For that you don't need a recursive pattern; you only need to forbid opening brackets in the part of the pattern that describes the text content.
$sentence = '[html style="font-size: 18px;" dir="ltr"][div style="font-size: 18px;" dir="ltr"][p style="font-weight: bold;"]Hello,[/p][p]You have got a new message from [a href="https://www.example.com/"]Example.com[/a][br][br].You could check your message on [a href="https://www.example.com/en/manager/inbox.html"]Manager[/a][/p][p][img src="https://www.example.com/assets/images/logo-default-120x50.png" width="120px" height="80px"][div style="color: #D4192D; font-weight: bold;"]Example.com Team[/div][/p][/div][/html]';
// proceed to the replacement of all self-closing tags first
$result = preg_replace('~\[ (br|hr|img)\b ([^]]*) ]~xi', '<$1$2/>', $sentence);
// then replace the innermost tags until there's nothing to replace
$count = 0;
do {
    $result = preg_replace('~
        \[ ( (\w+) [^]]* ) ]  # opening tag
        ( [^[]* )             # content without other bracketed tags
        \[/ \2 ]              # closing tag
    ~xi', '<$1>$3</$2>', $result, -1, $count);
} while ($count);
echo $result;
The 5th parameter of preg_replace is a variable passed by reference in which the number of replacements is stored ($count here). This variable is used as the condition to stop the do...while loop: when $count == 0, there is nothing left to replace.
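To make the role of $count visible, here is a small sketch that applies the same innermost-first idea to a shorter string and records how many replacements each pass performs. The function name convertInnermostFirst, the $passes array and the sample input are assumptions made for illustration:

```php
<?php
// Hypothetical wrapper around the innermost-first loop described above.
function convertInnermostFirst(string $text, array &$passes = []): string
{
    // Self-closing tags first, so they never block an enclosing pair.
    $result = preg_replace('~\[ (br|hr|img)\b ([^]]*) ]~xi', '<$1$2/>', $text);

    $pattern = '~
        \[ ( (\w+) [^]]* ) ]   # opening tag
        ( [^[]* )              # content without other bracketed tags
        \[/ \2 ]               # closing tag
    ~xi';

    $passes = [];
    do {
        $result = preg_replace($pattern, '<$1>$3</$2>', $result, -1, $count);
        $passes[] = $count; // replacements made during this pass
    } while ($count);

    return $result;
}

$passes = [];
echo convertInnermostFirst('[b]Hi [i]there[/i][/b][br]', $passes), "\n";
// prints: <b>Hi <i>there</i></b><br/>
echo implode(',', $passes), "\n";
// prints: 1,1,0
```

The first pass can only convert [i]...[/i], because the content of [b] still contains a bracket. Once that bracket is gone, the second pass converts [b]...[/b], and the third pass finds nothing, which ends the loop.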