PHP regex extract url with pattern from string

Time:02-21

I have found many topics on extracting all URLs from a string, and on detecting URLs that match a specific pattern, but none that do both. Sorry, I am a bit rusty with regex. Can someone please help?

Here is what I want:

$str = <<<EOF
  This string is valid - http://example.com/products/1
  This string is not valid - http://example.com/order/1
EOF;

Basically, I want to extract all URLs inside the $str variable that contain the pattern /products/

I tried this for the URL extraction: /\b(?:(?:https?|ftp):\/\/|www\.)[-a-z0-9+&@#\/%?=~_|!:,.;]*[-a-z0-9+&@#\/%=~_|]/i but I only want the URLs containing that pattern, not the others.

CodePudding user response:

You can match the allowed characters before and after /products/ using the same optionally repeated character class. As the character class is quite long, you can shorten the notation by wrapping it in a capture group and recursing the first subpattern with (?1)

Note that you don't have to escape the forward slash if you use a different delimiter (a backtick here).

$re = '`\b(?:(?:https?|ftp)://|www\.)([-a-z0-9+&@#/%?=~_|!:,.;]*)/products/(?1)[-a-z0-9+&@#/%=~_|]`';

$str = <<<EOF
  http://example.com/products/1/abc
  This string is valid - http://example.com/products/1
  This string is not valid - http://example.com/order/1
EOF;

preg_match_all($re, $str, $matches);
print_r($matches[0]);

Output

Array
(
    [0] => http://example.com/products/1/abc
    [1] => http://example.com/products/1
)
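
To make it clearer what (?1) stands for, here is a rough sketch that builds both the short form and the spelled-out form from the same pieces and checks that they return the same matches. The variable names ($prefix, $chars, $last, $short, $long) are only illustrative. It assumes a PHP build on PCRE2 10.30 or later (bundled with PHP 7.3+), where the engine can backtrack into a recursed group; on older PCRE the recursion is atomic and the short form may not match.

```php
<?php
// Building blocks copied from the pattern above.
$prefix = '\b(?:(?:https?|ftp)://|www\.)';
$chars  = '[-a-z0-9+&@#/%?=~_|!:,.;]*';   // allowed URL characters, repeated
$last   = '[-a-z0-9+&@#/%=~_|]';          // allowed final character

// Short form: capture the class once and reuse it with (?1).
$short = '`' . $prefix . '(' . $chars . ')/products/(?1)' . $last . '`';
// Long form: the same class written out twice, no recursion.
$long  = '`' . $prefix . $chars . '/products/' . $chars . $last . '`';

$str = <<<EOF
  http://example.com/products/1/abc
  This string is valid - http://example.com/products/1
  This string is not valid - http://example.com/order/1
EOF;

preg_match_all($short, $str, $shortMatches);
preg_match_all($long, $str, $longMatches);

print_r($shortMatches[0]);
// Element 0 of each result holds the full matches; compare those.
var_dump($shortMatches[0] === $longMatches[0]);
```

The long form avoids the recursion entirely, so it is the safer choice if the code has to run on old PHP versions.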

CodePudding user response:

Besides the answer from "The fourth bird", here is a hybrid solution that combines regex with classic string operations in a helper function. It gives you some additional options, e.g. getting different results at runtime without changing the regex.

<?php

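// Returns the URLs found in $str. If $pattern is a non-empty string, only
// URLs containing it are returned; pass false to get every URL.
function GetURL($str, $pattern='/products/')
{
    $temp = array();
    // Match every http(s) URL; it must end with a parenthesized group,
    // a non-punctuation character, or a slash.
    preg_match_all('#\bhttps?://[^,\s()<>]+(?:\([\w\d]+\)|([^,[:punct:]\s]|/))#', $str, $match);
    foreach ($match[0] as $link)
    {
        // Keep the link when no filter is set or when it contains $pattern.
        if (!$pattern || strpos($link, $pattern) !== false)
            $temp[] = $link;
    }
    return $temp;
}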

$str = <<<EOF
  This string is valid - http://example.com/products/1
  This string is not valid - http://example.com/order/1
EOF;

print_r(GetURL($str)); //Urls only with /products/ inside
print_r(GetURL($str, '/order/')); //Urls only with /order/ inside
print_r(GetURL($str, false)); //All urls

?>

Output

Array
(
    [0] => http://example.com/products/1
)
Array
(
    [0] => http://example.com/order/1
)
Array
(
    [0] => http://example.com/products/1
    [1] => http://example.com/order/1
)
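
One thing to keep in mind: strpos() finds the pattern anywhere in the link, including the query string. If only the path should count, a possible variation is to check the path returned by parse_url() instead. The sketch below is just an illustration; filterUrlsByPath is a hypothetical helper name, and the second sample URL is made up to show the difference.

```php
<?php
// Hypothetical helper: keep only URLs whose *path* contains $segment.
function filterUrlsByPath(array $urls, string $segment = '/products/'): array
{
    $kept = array_filter($urls, function ($url) use ($segment) {
        // parse_url() returns null when there is no path and false
        // when the URL is seriously malformed.
        $path = parse_url($url, PHP_URL_PATH);
        return is_string($path) && strpos($path, $segment) !== false;
    });
    return array_values($kept); // reindex from 0
}

$urls = [
    'http://example.com/products/1',
    'http://example.com/order/1?ref=/products/', // pattern only in the query string
];

$result = filterUrlsByPath($urls);
print_r($result);
```

Here the second URL is dropped because /products/ appears only in its query string, whereas a plain strpos() check on the whole link would keep it.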