Search, replace, and limit text to a specific number of characters using regex in PHP

Time:09-22

I want to replace the text between <loc> and </loc>, limiting it to a specific number of characters.

<?php
    $string = '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9 http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd">
    <url>
    <loc>https://subdomain.example.com</loc>
    <priority>1.0</priority>
    <changefreq>always</changefreq>
    </url>
    <url>
    <loc>https://subdomain.example.com/s/queen-katwe-2016-720p-hd-480p-hd/</loc>
    <priority>1.0</priority>
    <changefreq>always</changefreq>
    </url><url>
    <loc>https://subdomain.example.com/s/justice-league-dark-2017-720p-hd-480p-hd/</loc>
    <priority>1.0</priority>
    <changefreq>always</changefreq>
    </url><url>
    <loc>https://subdomain.example.com/s/edge-seventeen-2016-720p-hd-480p-hd/</loc>
    <priority>1.0</priority>
    <changefreq>always</changefreq>
    </url></urlset>';
    
    $search = "/(<loc>)(.*?)(<\/loc>)/";
    $replace =  mb_strimwidth('$2', 0, 15);
    $total = preg_replace($search,$replace,$string);
    echo $total;
?>

I have tried this and it's not working. Please help me out; thank you in advance.

CodePudding user response:

Your input is XML, which is more than just a string, so I'd recommend tools that understand XML itself, such as DOMDocument. I don't know exactly what logic you're after (and I didn't even know mb_strimwidth existed), but it could be written as:

$xml = <<<EOT
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.sitemaps.org/schemas/sitemap/0.9 http://www.sitemaps.org/schemas/sitemap/0.9/sitemap.xsd">
<url>
<loc>https://subdomain.example.com</loc>
<priority>1.0</priority>
<changefreq>always</changefreq>
</url>
<url>
<loc>https://subdomain.example.com/s/queen-katwe-2016-720p-hd-480p-hd/</loc>
<priority>1.0</priority>
<changefreq>always</changefreq>
</url><url>
<loc>https://subdomain.example.com/s/justice-league-dark-2017-720p-hd-480p-hd/</loc>
<priority>1.0</priority>
<changefreq>always</changefreq>
</url><url>
<loc>https://subdomain.example.com/s/edge-seventeen-2016-720p-hd-480p-hd/</loc>
<priority>1.0</priority>
<changefreq>always</changefreq>
</url></urlset>
EOT;

$dom = new DOMDocument;
$dom->loadXML($xml);

foreach($dom->getElementsByTagName('loc') as $node) {
    if ((XML_ELEMENT_NODE === $node->nodeType) && ('loc' === $node->nodeName)){
        $node->nodeValue = mb_strimwidth($node->nodeValue, 0, 15);
    }
}

echo $dom->saveXML();

Demo here: https://3v4l.org/fvS02

Note: You appear to be doing something with the URL. Once again, a URL is more than just a string and PHP has parse_url for parsing URLs which I'd encourage you to use, if that is indeed what you are doing.
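
As a rough illustration of that suggestion, here is a minimal sketch that uses parse_url to leave the scheme and host alone and trims only the path. The helper name shortenUrlPath and the decision to drop port, query, and fragment are assumptions made for this example, not something the question asks for.

```php
<?php
// Hypothetical helper: keep scheme and host intact, trim only the path.
// Port, query string, and fragment are deliberately ignored in this sketch.
function shortenUrlPath(string $url, int $maxPathWidth): string
{
    $parts = parse_url($url);
    if ($parts === false || !isset($parts['scheme'], $parts['host'])) {
        // Not an absolute URL we can take apart; fall back to plain trimming.
        return mb_strimwidth($url, 0, $maxPathWidth);
    }

    // A URL without a path (e.g. the site root) has no 'path' key.
    $path = $parts['path'] ?? '';

    return $parts['scheme'] . '://' . $parts['host'] . mb_strimwidth($path, 0, $maxPathWidth);
}

echo shortenUrlPath('https://subdomain.example.com/s/queen-katwe-2016-720p-hd-480p-hd/', 15), "\n";
// prints: https://subdomain.example.com/s/queen-katwe-
```

Trimming the path rather than the whole string keeps the result a usable URL, which a plain 15-character cut of the full string would not.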

EDIT

If your source data isn't XML, I'd still use a parser if possible. DOMDocument supports HTML, too; you just need to suppress some warnings because HTML usually isn't as strict.
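
A minimal sketch of the HTML route, assuming the <loc> tags sit inside an HTML fragment (the sample markup below is made up for illustration). libxml_use_internal_errors collects parser complaints, such as one about the unknown <loc> tag, instead of printing them as PHP warnings.

```php
<?php
// Assumed sample input: an HTML fragment, not taken from the question.
$html = '<div><loc>https://subdomain.example.com/s/queen-katwe-2016-720p-hd-480p-hd/</loc></div>';

// Collect libxml errors internally rather than emitting PHP warnings.
$previous = libxml_use_internal_errors(true);

$dom = new DOMDocument;
$dom->loadHTML($html);

// Discard whatever was collected and restore the earlier error mode.
libxml_clear_errors();
libxml_use_internal_errors($previous);

foreach ($dom->getElementsByTagName('loc') as $node) {
    $node->nodeValue = mb_strimwidth($node->nodeValue, 0, 15);
}

// saveHTML() serializes the whole document, including the wrapper
// elements that the HTML parser adds around a fragment.
echo $dom->saveHTML();
```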

But if no parser exists for your data, a regex may be the better choice. In that case I'd use a callback function to decide what each match is replaced with.

$xml = <<<EOT
<loc>https://subdomain.example.com</loc>
<loc>https://subdomain.example.com/s/queen-katwe-2016-720p-hd-480p-hd/</loc>
<loc>https://subdomain.example.com/s/justice-league-dark-2017-720p-hd-480p-hd/</loc>
<loc>https://subdomain.example.com/s/edge-seventeen-2016-720p-hd-480p-hd/</loc>
EOT;

var_dump(
    preg_replace_callback(
        '/<loc>(?<value>[^<]+)<\/loc>/',
        static function($matches) {
            return sprintf('<loc>%1$s</loc>', mb_strimwidth($matches['value'], 0, 15));
        },
        $xml
    )
);

Demo: https://3v4l.org/OhmtZ
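
For reference, the attempt in the question fails because the replacement argument is evaluated before preg_replace ever runs. mb_strimwidth receives the literal two-character string '$2', which is already shorter than 15, and returns it unchanged. preg_replace then substitutes the full, untrimmed capture and drops the <loc> tags. The sketch below reproduces this with the question's own pattern and shows the same pattern working with a callback; the variable names are only for illustration.

```php
<?php
// The replacement is computed once, up front, from the literal text '$2'.
$replace = mb_strimwidth('$2', 0, 15);
var_dump($replace); // string(2) "$2"

$input = '<loc>https://subdomain.example.com/s/edge-seventeen-2016-720p-hd-480p-hd/</loc>';

// So this call is equivalent to preg_replace($pattern, '$2', $input):
// the tags are lost and the URL is not shortened at all.
$broken = preg_replace('/(<loc>)(.*?)(<\/loc>)/', $replace, $input);

// A callback runs once per match, so it sees the captured text itself.
$fixed = preg_replace_callback(
    '/(<loc>)(.*?)(<\/loc>)/',
    static fn(array $m): string => $m[1] . mb_strimwidth($m[2], 0, 15) . $m[3],
    $input
);

echo $broken, "\n"; // https://subdomain.example.com/s/edge-seventeen-2016-720p-hd-480p-hd/
echo $fixed, "\n";  // <loc>https://subdoma</loc>
```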
