preg match with html tag mailto

Time:08-26

I have the code below and I want to capture the value that starts with Data Contact:, because I need to find the email addresses in the text.

My code worked when the text didn't contain HTML, but I don't know how to change the regex in preg_match_all so that it also works when the addresses are wrapped in HTML tags.


$text = 'Some text  Some text 

<a href="mailto:[email protected]">[email protected]</a><br />

Data Contact: <a href="mailto:[email protected]">[email protected]</a>, <a href="mailto:[email protected]">[email protected]</a><br />
<a href="mailto:[email protected]">[email protected]</a><br />

Some text  Some text  Some text';

preg_match_all("/Data Contact: +[\._a-zA-Z0-9-]+@[\._a-zA-Z0-9-]+/i", $text, $matches);

foreach($matches[0] as $val){
    
    echo  str_replace("Data Contact:", "",$val);
}

CodePudding user response:

I could see two possible approaches to this issue:

<?php
$text = 'Some text  Some text 

Data Contact: <a href="mailto:[email protected]">[email protected]</a>, <a href="mailto:[email protected]">[email protected]</a><br />
<a href="mailto:[email protected]">[email protected]</a><br />

Some text  Some text  Some text';
preg_match_all("/Data Contact: +\K[-.\w]+@[-.\w]+/i", strip_tags($text), $matches);
foreach($matches[0] as $val){
    echo $val;
}

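This seems to match the question's description: strip_tags() removes the HTML markup, then the regex pulls the email address that follows Data Contact:. The \K resets the start of the match, so Data Contact: itself is not part of the result. Note that only the first address after Data Contact: is captured.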

Alternatively, an HTML parser could be used to pull every link whose href contains mailto:, which matches the question's title:

$text = 'Some text  Some text 

Data Contact: <a href="mailto:[email protected]">[email protected]</a>, <a href="mailto:[email protected]">[email protected]</a><br />
<a href="mailto:[email protected]">[email protected]</a><br />

Some text  Some text  Some text';
$dom = new DOMDocument;
$dom->loadHTML($text);
$links = $dom->getElementsByTagName('a');
foreach($links as $link){
    $href = $link->getAttribute('href');
    if(strpos($href, 'mailto:') !== FALSE){
        echo str_replace('mailto:', '', $href);
    }
}
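
As a variation on the parser approach, the mailto: filter can be pushed into an XPath query so that only links whose href starts with mailto: are visited. This is a minimal sketch, not a definitive implementation: the helper name extractMailtoAddresses and the example.com addresses are hypothetical placeholders, and starts-with() is case-sensitive, so an href written as MAILTO: would be skipped.

```php
<?php
// Hypothetical sample input; the addresses are placeholders, not from the question.
$html = 'Data Contact: <a href="mailto:first@example.com">first@example.com</a>, '
      . '<a href="mailto:second@example.com">second@example.com</a><br />'
      . '<a href="https://example.com/">not an email link</a>';

// Hypothetical helper: returns the address of every <a> whose href starts with "mailto:".
function extractMailtoAddresses(string $html): array
{
    $dom = new DOMDocument();

    // Collect libxml parse warnings internally instead of emitting them,
    // since the input is an HTML fragment rather than a full document.
    $previous = libxml_use_internal_errors(true);
    $dom->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($dom);
    $addresses = [];
    foreach ($xpath->query('//a[starts-with(@href, "mailto:")]') as $link) {
        // Drop the leading "mailto:" and keep the rest of the href.
        $addresses[] = substr($link->getAttribute('href'), strlen('mailto:'));
    }
    return $addresses;
}

print_r(extractMailtoAddresses($html));
```

Returning an array instead of echoing inside the loop leaves the formatting (separators, line breaks) to the caller.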

Update, for updated requirement:

<?php
$text = 'Some text  Some text 

Data Contact: <a href="mailto:[email protected]">[email protected]</a>, <a href="mailto:[email protected]">[email protected]</a><br />
<a href="mailto:[email protected]">[email protected]</a><br />

Some text  Some text  Some text';
$emails = preg_replace_callback("/.*Data Contact: +.*/is", function($match){
    preg_match_all('/mailto:\K[-.\w]+@[-.\w]+/', $match[0], $matches);
    $emails = '';
    foreach($matches[0] as $email){
        $emails .= $email . PHP_EOL;
    }
    return $emails;
}, $text);
echo $emails;

This finds Data Contact: first, then pulls the email address out of each mailto: value in the matched text.
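
Because the pattern above uses the s modifier, .* also spans newlines, so the callback receives the whole text and mailto: links on other lines are collected too. If only the addresses on the Data Contact: line itself are wanted (an assumption about the requirement, not something the question states), a sketch along these lines may be closer. The helper name dataContactEmails and the example.com addresses are hypothetical.

```php
<?php
// Hypothetical sample input; the addresses are placeholders, not from the question.
$text = 'Some text
<a href="mailto:before@example.com">before@example.com</a><br />
Data Contact: <a href="mailto:first@example.com">first@example.com</a>, <a href="mailto:second@example.com">second@example.com</a><br />
<a href="mailto:after@example.com">after@example.com</a><br />
Some text';

// Hypothetical helper: returns the mailto: addresses found on the "Data Contact:" line only.
function dataContactEmails(string $text): array
{
    // Without the "s" modifier, "." does not match a newline,
    // so (.*) captures only the remainder of the Data Contact: line.
    if (!preg_match('/Data Contact:(.*)/i', $text, $line)) {
        return [];
    }

    // \K discards "mailto:" from the match, leaving just the address.
    preg_match_all('/mailto:\K[-.\w]+@[-.\w]+/', $line[1], $matches);
    return $matches[0];
}

echo implode(PHP_EOL, dataContactEmails($text)), PHP_EOL;
```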
