Home > Blockchain >  Automatically detect URL links in a text
Automatically detect URL links in a text

Time:03-06

I am creating a job announcement website.

Currently, I am stuck in dealing with URLs included in the content of a job announcement.

I can perfectly display the said content from my database to my website but URLs are not detected.

In Facebook, when you post something like that, the site automatically detect these links. I want also to achieve this in my own website.

job announcement content picture

CodePudding user response:

In A Liberal, Accurate Regex Pattern for Matching URLs I found the following Regex

\b(([\w-] ://?|www[.])[^\s()<>] (?:([\w\d] )|([^[:punct:]\s]|/)))

Solution

/**
 * @param string $str the string to encode and parse for URLs
 */
function preventXssAndParseAnchors(string $str): string
{
    $url_regex = "/\b((https?:\/\/?|www\.)[^\s()<>] (?:\([\w\d] \)|([^[:punct:]\s]|\/)))/";

    // Encoding HTML special characters To prevent XSS
    // Before parsing the URLs to Anchors
    $str = htmlspecialchars($str, ENT_QUOTES, 'UTF-8');

    preg_match_all($url_regex, $str, $urls);

    foreach ($urls[0] as $url) {
        $str = str_replace($url, "<a href='$url'>$url</a>", $str);
    }

    return $str;
}

Example

<?php
$str = "
apply here https://ph.dbsd.com/job/dfvdfg/5444

<script> console.log('this is a hacking attempt hacking'); </script>

and www.google.com

also http://somesite.net
";

echo preventXssAndParseAnchors($str);

The output

apply here <a href='https://ph.dbsd.com/job/dfvdfg/5444'>https://ph.dbsd.com/job/dfvdfg/5444</a>

&lt;script&gt; console.log(&#039;this is a hacking attempt hacking&#039;); &lt;/script&gt;

and <a href='www.google.com'>www.google.com</a>

also <a href='http://somesite.net'>http://somesite.net</a>

Test https://3v4l.org/85lsl

  • Related