I have some HTML pages that contain phone numbers in different formats. Example:
<p style="text-align: center;">(xxx) xxxx xxxx</p>
<span style="text-align: center;">xxxxxxxxxx</span>
<li style="text-align: center;">(xxx) x xxx xxxx</li>
<p style="text-align: left;">xxxxx xxxx</p>
I would like to know the best way to change or even remove them using PHP.
My main idea is to use XPath with a regex to find the text, but I believe regex doesn't work with XPath.
CodePudding user response:
I'm not familiar with XPath, but I found a nice article that can help you: Use PHP Functions in XPath Expressions.
You need to create a function that does the work with preg_match_all, preg_match, or preg_replace.
Then put your HTML code in a variable:
$YourHtmlCode = <<<HTML
<p style="text-align: center;">(xxx) xxxx xxxx</p>
<span style="text-align: center;">xxxxxxxxxx</span>
<li style="text-align: center;">(xxx) x xxx xxxx</li>
<p style="text-align: left;">xxxxx xxxx</p>
HTML;
Convert your HTML text to a DOMDocument like this:
$dom = new DOMDocument;
$dom->loadHTML($YourHtmlCode, LIBXML_HTML_NOIMPLIED|LIBXML_HTML_NODEFDTD);
After that, use registerPHPFunctions so that the function described above can be called from your XPath expression.
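Here is a rough sketch of how those pieces could fit together. It is not taken from the article: the helper name looksLikePhone, its pattern, and the sample markup with placeholder digits are all assumptions for illustration. The sample is wrapped in a single <div> because LIBXML_HTML_NOIMPLIED tends to behave more predictably with one root element.

```php
<?php
// Hypothetical sample markup; the digits are placeholders, not real numbers.
$html = '<div>'
      . '<p style="text-align: center;">(555) 0100 0199</p>'
      . '<span style="text-align: center;">5550100199</span>'
      . '<p>Opening hours: 9 to 5</p>'
      . '</div>';

// Hypothetical helper (assumed heuristic): true when the whole text is at
// least 9 characters of digits, whitespace, parentheses, plus, dot or hyphen.
function looksLikePhone(string $text): bool
{
    return preg_match('/^[\d\s()+.-]{9,}$/', trim($text)) === 1;
}

$dom = new DOMDocument();
$dom->loadHTML($html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);

$xpath = new DOMXPath($dom);
// The "php" prefix must be bound to this namespace before php:functionString works.
$xpath->registerNamespace('php', 'http://php.net/xpath');
// Only allow the one helper to be called from XPath.
$xpath->registerPhpFunctions('looksLikePhone');

// Select every text node whose string value passes the helper.
$nodes = $xpath->query('//text()[php:functionString("looksLikePhone", .)]');

$removed = [];
// Copy the node list to an array first, then remove the matching text nodes.
foreach (iterator_to_array($nodes) as $node) {
    $removed[] = $node->nodeValue;
    $node->parentNode->removeChild($node);
}

echo $dom->saveHTML();
```

To change a number instead of removing it, you could assign a new string to $node->nodeValue inside the loop.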
I use (?<=>)(.*?)(?=<) to match all the text between a > and a < character.
You can get all the parts like this:
<?php
$reg = '/(?<=\>)(.*?)(?=\<)/m';
$str = '<p style="text-align: center;">(xxx) xxxx xxxx</p>
<span style="text-align: center;">xxxxxxxxxx</span>
<li style="text-align: center;">(xxx) x xxx xxxx</li>
<p style="text-align: left;">xxxxx xxxx</p>';
preg_match_all($reg, $str, $matches, PREG_SET_ORDER);
foreach ($matches as $val) {
echo "matched: " . $val[0] . "\n";
}
?>
After that, you can modify each matched value directly.
If you want to replace the value directly with the regex, you can use preg_replace. Note that this pattern replaces every piece of text between tags, not only phone numbers.
For example:
<?php
$reg = '/(?<=\>)(.*?)(?=\<)/m';
$str = '<p style="text-align: center;">(xxx) xxxx xxxx</p>
<span style="text-align: center;">xxxxxxxxxx</span>
<li style="text-align: center;">(xxx) x xxx xxxx</li>
<p style="text-align: left;">xxxxx xxxx</p>';
echo preg_replace($reg, "ReplaceString", $str);
?>
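If you only want to touch the text that looks like a phone number, one option is preg_replace_callback with the same pattern. This is only a sketch: the inner phone heuristic, the "[phone removed]" placeholder, and the sample markup with placeholder digits are my assumptions, so adjust them to your own number formats.

```php
<?php
// Hypothetical sample markup; the digits are placeholders, not real numbers.
$str = '<p>(555) 0100 0199</p>
<span>5550100199</span>
<p>Opening hours: 9 to 5</p>';

// Same idea as above: capture the text between a > and the next <.
$reg = '/(?<=>)(.*?)(?=<)/';

$result = preg_replace_callback($reg, function (array $m): string {
    // Assumed heuristic: the whole text is at least 9 characters of digits,
    // whitespace, parentheses, plus, dot or hyphen.
    $isPhone = preg_match('/^[\d\s()+.-]{9,}$/', trim($m[0])) === 1;

    // Replace phone-like text, leave everything else untouched.
    return $isPhone ? '[phone removed]' : $m[0];
}, $str);

echo $result;
```

Returning an empty string from the callback would remove the number and keep the surrounding tag.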
CodePudding user response:
An example using regular expressions. The surrounding tags are also removed.
((\ |\d|\(|(<.*?>))[\d\-\(\)\. ]{9,}(\.|\n| |<\/.*>)(?!(png|jpg|<)))
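A minimal sketch of how this pattern could be used with preg_replace. The sample markup and its placeholder digits are assumptions for illustration; the pattern itself is the one above, wrapped in / delimiters.

```php
<?php
// The pattern from above, unchanged apart from the / delimiters.
$pattern = '/((\ |\d|\(|(<.*?>))[\d\-\(\)\. ]{9,}(\.|\n| |<\/.*>)(?!(png|jpg|<)))/';

// Hypothetical sample markup; the digits are placeholders, not real numbers.
$html = "<p>Call us today</p>\n"
      . "<p style=\"text-align: center;\">(555) 0100 0199</p>\n"
      . "<span>5550100199</span>";

// Remove each phone number together with its surrounding tags.
$cleaned = preg_replace($pattern, '', $html);

echo $cleaned;
```

In this sample each element sits on its own line. The closing part <\/.*> is greedy, so when several elements share one line it can consume up to the last > on that line. It is worth testing the pattern against your real pages before relying on it.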