I want to create a regex for the PHP function "preg_match_all" that extracts only the link tags (<a>) whose href contains "/go/". I'm pretty close to achieving it, but with these 4 examples:
1. Here is a link <a href="https://test.fr/another-link"></a> and
another one <a href="https://test.fr/go/url1"></a>.
2. <a href="https://test.fr/go/url2"></a>
3. <a href="https://test.fr/go/url3"></a>
4. <a href="https://test.fr/go/url4" ></a>
and using:
preg_match_all('/<a.*?[^>].*?href="https:\/\/test.fr\/go\/.*?".*?>.*?<\/a>/is', $input_lines, $output_array);
I expect this output:
array(
    0 => array(
        0 => <a href="https://test.fr/go/url1"></a>
        1 => <a href="https://test.fr/go/url2"></a>
        2 => <a href="https://test.fr/go/url3"></a>
        3 => <a href="https://test.fr/go/url4" ></a>
    )
)
but for array[0][0] I'm getting
0 => <a href="https://test.fr/another-link"></a> and another one <a href="https://test.fr/go/url1"></a>
I tried to exclude any closing ">" between the start of the tag and its href, but without success.
Any ideas? Thanks
CodePudding user response:
You can use an HTML parser, like https://simplehtmldom.sourceforge.io/docs/1.9/index.html
Then you can do something like
$html = str_get_html($html_string);
foreach ($html->find('a') as $anchor) {
    if (strpos($anchor->href, '/go/') !== false) {
        // This anchor is interesting, do something with it
    }
}
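If adding a third-party library is not an option, the same idea can be sketched with PHP's built-in DOM extension. This is only a minimal sketch: it assumes the dom extension is enabled, the function name findGoLinks is hypothetical, and the sample input is copied from the question.

```php
<?php
// Sample input copied from the four examples in the question.
$html_string = <<<'HTML'
1. Here is a link <a href="https://test.fr/another-link"></a> and
another one <a href="https://test.fr/go/url1"></a>.
2. <a href="https://test.fr/go/url2"></a>
3. <a href="https://test.fr/go/url3"></a>
4. <a href="https://test.fr/go/url4" ></a>
HTML;

// Hypothetical helper: returns the href of every <a> whose href contains "/go/".
function findGoLinks(string $html): array
{
    $doc = new DOMDocument();

    // Collect libxml warnings internally instead of emitting them,
    // since real-world fragments are rarely well-formed documents.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $hrefs = [];
    foreach ($doc->getElementsByTagName('a') as $anchor) {
        $href = $anchor->getAttribute('href');
        if (strpos($href, '/go/') !== false) {
            $hrefs[] = $href;
        }
    }
    return $hrefs;
}

$goLinks = findGoLinks($html_string);
print_r($goLinks);
```

Because the parser works on elements rather than on raw text, a link without "/go/" that appears earlier on the same line cannot leak into a later result.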
CodePudding user response:
I'm not a regex expert, but I've been testing and this expression seems to work.
/<a href=\"https:\/\/test\.fr\/go\/.*/m
Hope this helps
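For reference, the pattern in the question over-matches because a lazy `.*?` may still consume ">" (and, with the "s" flag, newlines): the engine starts at the first "<a" and keeps extending until it reaches an href containing "/go/", even if that href belongs to a later tag. Replacing the dots with negated character classes keeps each part inside a single tag. The following is a sketch, not a general HTML matcher: it assumes double-quoted href values and no nested anchors, and the "~" delimiter is just a choice to avoid escaping "/".

```php
<?php
// Sample input copied from the four examples in the question.
$input_lines = <<<'HTML'
1. Here is a link <a href="https://test.fr/another-link"></a> and
another one <a href="https://test.fr/go/url1"></a>.
2. <a href="https://test.fr/go/url2"></a>
3. <a href="https://test.fr/go/url3"></a>
4. <a href="https://test.fr/go/url4" ></a>
HTML;

// <a\b        the tag name "a" only (not e.g. <abbr>)
// [^>]*       any attributes, but never past the end of the opening tag
// href="..."  a value that contains /go/ and cannot cross the closing quote
// [^>]*>      rest of the opening tag
// .*?</a>     content up to the nearest closing tag
$pattern = '~<a\b[^>]*\bhref="[^"]*/go/[^"]*"[^>]*>.*?</a>~is';

preg_match_all($pattern, $input_lines, $output_array);
print_r($output_array[0]);
```

With the sample input, `$output_array[0]` holds the four "/go/" anchors and skips the "another-link" one.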