Get substring without a specific character in another substring (regex php)

Time:06-15

I want to create a regex for the PHP function "preg_match_all" that extracts exactly the link tags (<a>) whose href contains "/go/". I'm pretty close to achieving it, but with these 4 examples:


 1. Here is a link <a href="https://test.fr/another-link"></a> and
    another one <a href="https://test.fr/go/url1"></a>.
 2. <a href="https://test.fr/go/url2"></a>
 3. <a  href="https://test.fr/go/url3"></a>
 4. <a href="https://test.fr/go/url4" ></a>

and using:

preg_match_all('/<a.*?[^>].*?href="https:\/\/test.fr\/go\/.*?".*?>.*?<\/a>/is', $input_lines, $output_array);

I am expecting this output:

array(
   0    =>  array(
       0    =>  <a href="https://test.fr/go/url1"></a>
       1    =>  <a href="https://test.fr/go/url2"></a>
       2    =>  <a  href="https://test.fr/go/url3"></a>
       3    =>  <a href="https://test.fr/go/url4" ></a>
  )
)

but for array[0][0] I'm getting

0   =>  <a href="https://test.fr/another-link"></a> and another one <a href="https://test.fr/go/url1"></a>

I tried to exclude any closing tag ">" between the opening of the tag and its href, but without any success.

Any ideas? Thanks

CodePudding user response:

You can use an HTML parser, like https://simplehtmldom.sourceforge.io/docs/1.9/index.html

Then you can do something like

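// Load the library first; adjust the path to wherever simple_html_dom.php is installed
require_once 'simple_html_dom.php';

$html = str_get_html($html_string);
foreach ($html->find('a') as $anchor) {
    // Keep only anchors whose href contains "/go/"
    if (strpos($anchor->href, '/go/') !== false) {
        // This anchor is interesting, do something with it
    }
}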
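
If adding a third-party library is not an option, the same idea can probably be sketched with the DOM extension that ships with PHP. This is only a minimal example, assuming the DOM extension is enabled; the variable names ($htmlString, $goHrefs, $goLinks) and the sample markup are made up for illustration. Note that saveHTML() re-serializes the element, so the original spacing inside the tag is not preserved.

```php
<?php
// Sample markup based on the question's examples (hypothetical input).
$htmlString = '<p>Here is a link <a href="https://test.fr/another-link"></a> and '
    . 'another one <a href="https://test.fr/go/url1"></a>.</p>'
    . '<p><a  href="https://test.fr/go/url3"></a></p>';

$doc = new DOMDocument();
// Collect libxml parse warnings internally instead of emitting PHP warnings.
libxml_use_internal_errors(true);
$doc->loadHTML($htmlString);
libxml_clear_errors();

$goHrefs = []; // href values that contain "/go/"
$goLinks = []; // the matching <a> elements, re-serialized as markup

foreach ($doc->getElementsByTagName('a') as $anchor) {
    $href = $anchor->getAttribute('href');
    if (strpos($href, '/go/') !== false) {
        $goHrefs[] = $href;
        // Passing a node to saveHTML() serializes only that element.
        $goLinks[] = $doc->saveHTML($anchor);
    }
}

print_r($goHrefs);
print_r($goLinks);
```

Because the parser works on elements rather than raw text, extra spaces inside the tag or other attributes before href should not matter here.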

CodePudding user response:

I'm not an expert in Regex but I've been testing and this expression seems to work.

/<a href=\"https:\/\/test\.fr\/go\/.*/m
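
One caveat: this expression expects exactly one space before href, and ".*" runs to the end of the line, so it also captures whatever follows the link. Also, preg_match_all already returns every match, and PHP's PCRE has no "g" modifier. A possible variant, only a sketch checked against the four examples from the question, uses negated character classes so the match can never cross a ">" or a closing quote. The "~" delimiter and the variable names are my own choices.

```php
<?php
// The four examples from the question.
$input = <<<'HTML'
1. Here is a link <a href="https://test.fr/another-link"></a> and
   another one <a href="https://test.fr/go/url1"></a>.
2. <a href="https://test.fr/go/url2"></a>
3. <a  href="https://test.fr/go/url3"></a>
4. <a href="https://test.fr/go/url4" ></a>
HTML;

// <a\s        tag name followed by whitespace
// [^>]*?      anything before href, but never past the end of this tag
// [^"]*       the rest of the URL, kept inside its quotes
// [^>]*>      the remainder of the opening tag
// .*?</a>     link content up to the nearest closing tag
$pattern = '~<a\s[^>]*?href="https://test\.fr/go/[^"]*"[^>]*>.*?</a>~is';

preg_match_all($pattern, $input, $matches);
print_r($matches[0]);
```

The original pattern from the question fails because ".*?[^>].*?" only requires one character that is not ">"; the surrounding ".*?" can still run across ">" into the next tag. Replacing it with "[^>]*?" removes that possibility. This is still a regex and not an HTML parser: an attribute value containing ">" would break it, and an attribute such as data-href would also be accepted.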

Hope this helps
