Regex to match and extract North American phone numbers

Time:11-30

I need to match and extract phone numbers from text. The numbers come in the following formats:

589-845-2889

(589)-845-2889

589.845.2889

589 845 2889

5898452889

(589) 845 2889

The following operation matches the above with great accuracy:

preg_match_all("/(\()?\d{3}(?(1)\))[-. ]?\d{3}[-. ]?\d{4}/", $test, $result);

So I need to extend it to support an optional international prefix, such as +1 or 1. It should therefore also match 1(589) 845 2889, +1(589) 845 2889, 1589 845 2889, 1 589 845 2889, 15898452889, etc.

Any help would be greatly appreciated.

CodePudding user response:

For the example data, you might use:

^(?:\+?1\h?)?(?:(\()?\d{3}(?(1)\))([-. ]?)\d{3}\2\d{4}|\d{11})$

The pattern matches:

  • ^ Start of string
  • (?:\+?1\h?)? Optionally match 1, optionally preceded by + and optionally followed by a horizontal whitespace character
  • (?: Non-capture group
    • (\()?\d{3}(?(1)\)) Match 3 digits, optionally wrapped in parentheses; the conditional (?(1)\)) requires the closing ) only when group 1 matched the opening (
    • ([-. ]?) Capture an optional -, . or space in group 2
    • \d{3} Match 3 digits
    • \2 Backreference to group 2 to keep the delimiters the same
    • \d{4} Match 4 digits
    • | Or
    • \d{11} Match 11 digits
  • ) Close non capture group
  • $ End of string

Example

$re = '/^(?:\+?1\h?)?(?:(\()?\d{3}(?(1)\))([-. ]?)\d{3}\2\d{4}|\d{11})$/m';
$str = '589-845-2889
(589)-845-2889
589.845.2889
589 845 2889
5898452889
(589) 845 2889
1(589) 845 2889
+1(589) 845 2889
1589 845 2889
1 589 845 2889
15898452889
+1207 244 7002
+1 207 244 7002';

preg_match_all($re, $str, $matches);
print_r($matches[0]);

Output

Array
(
    [0] => 589-845-2889
    [1] => (589)-845-2889
    [2] => 589.845.2889
    [3] => 589 845 2889
    [4] => 5898452889
    [5] => (589) 845 2889
    [6] => 1(589) 845 2889
    [7] => +1(589) 845 2889
    [8] => 1589 845 2889
    [9] => 1 589 845 2889
    [10] => 15898452889
    [11] => +1207 244 7002
    [12] => +1 207 244 7002
)
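
The pattern above is anchored with ^ and $, so it checks one number per line. Since the question also asks to extract numbers from running text, here is a minimal sketch of an unanchored variant. The function name extractPhoneNumbers and the two lookarounds are assumptions and not part of the answer above. The \d{11} alternative is left out, so an 11-digit run is accepted only when it starts with 1.

```php
<?php
// Hypothetical helper (not from the answer above): finds phone numbers inside free text.
function extractPhoneNumbers(string $text): array
{
    // (?<![\d+]) : the match must not start in the middle of a digit run or right after a +
    // (?!\d)     : the match must not be followed by another digit
    // The rest mirrors the anchored pattern: optional +1 or 1 prefix, area code with
    // balanced parentheses, and the same delimiter (group 2) used twice.
    $re = '/(?<![\d+])(?:\+?1\h?)?(\()?\d{3}(?(1)\))([-. ]?)\d{3}\2\d{4}(?!\d)/';
    preg_match_all($re, $text, $matches);
    return $matches[0];
}

$text = 'Call +1 589 845 2889 or (589)-845-2889. Ignore 589-845.2889 and 25898452889.';
print_r(extractPhoneNumbers($text));
```

For this sample text the function returns +1 589 845 2889 and (589)-845-2889. The number with mixed delimiters is skipped because of the \2 backreference, and the 11-digit run that does not start with 1 is skipped because of the digit lookarounds.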

CodePudding user response:

Alternative approach:

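You can normalize all the variants to one common syntax by replacing every delimiter with a single character, say a dot (.). Then run two checks:

  • The string is all digits and its length is 10 or 11.
  • OR it follows one of the formats 1.207.333.4444, 207.333.4444, 1(207).333.4444 or (207).333.4444, where the leading digit may also carry a + sign.

Snippet:

<?php

function isValid($str){
    // Normalize every delimiter (dot, hyphen, whitespace) to a dot.
    $str = preg_replace('/[.\-\s]/','.',$str);
    // Accept a bare run of 10 or 11 digits, or the dotted form with an
    // optional one-digit prefix (optionally written with +) and an
    // optionally parenthesized area code.
    return preg_match('/^\d{10,11}$/', $str) === 1 ||
           preg_match('/^((\+?\d)?\.?(\d{3}|\(\d{3}\)))\.(\d{3})\.(\d{4})$/', $str) === 1;
}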
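
A quick way to see what the two checks accept is to run them over a few samples. The sketch below repeats the snippet's logic under the hypothetical name isValidPhone so that it runs on its own, and it assumes the optional prefix is a literal +.

```php
<?php
// Hypothetical stand-alone copy of the check above, used only for this demo.
function isValidPhone(string $str): bool
{
    // Replace dots, hyphens and whitespace with a single common delimiter.
    $normalized = preg_replace('/[.\-\s]/', '.', $str);
    return preg_match('/^\d{10,11}$/', $normalized) === 1
        || preg_match('/^((\+?\d)?\.?(\d{3}|\(\d{3}\)))\.(\d{3})\.(\d{4})$/', $normalized) === 1;
}

$samples = ['589-845-2889', '+1 (207) 333 4444', '15898452889', '589 845-2889', '589-845-288'];
foreach ($samples as $sample) {
    // var_export prints true/false instead of 1 or an empty string.
    echo $sample, ' => ', var_export(isValidPhone($sample), true), PHP_EOL;
}
```

The first four samples are accepted and the last one is rejected. Note that 589 845-2889 passes: normalizing the delimiters erases the difference between them, so this approach does not enforce a consistent delimiter the way a \2 backreference does.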

Suggestion:

Instead of a single text box in the UI, provide 3-4 input blocks separated by a fixed delimiter, say - or a dot. This makes backend processing and JavaScript validation easier.

.input_ph{
 width:35px;
}
<html>
<body>
<span>+</span><input type="text" class="input_ph" />&nbsp;&nbsp;&nbsp;
<span>-</span>&nbsp;&nbsp;&nbsp;
<span>(</span><input type="text" class="input_ph" />
<span>)</span>&nbsp;&nbsp;&nbsp;
<span>-</span>&nbsp;&nbsp;&nbsp;<input type="text" class="input_ph" />
&nbsp;&nbsp;&nbsp;
<span>-</span>&nbsp;&nbsp;&nbsp;<input type="text" class="input_ph" />
&nbsp;&nbsp;&nbsp;
</body>
</html>
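
With separate blocks, the backend check can be reduced to a length and digit test per block. The following is only a sketch: the function name joinPhoneBlocks and its four parameters are assumptions, since the inputs in the markup above carry no name attributes, and restricting the country block to 1 is an assumption based on the North American scope of the question.

```php
<?php
// Hypothetical backend helper: validates the four blocks and joins them.
// Returns the formatted number, or null if any block is invalid.
function joinPhoneBlocks(string $country, string $area, string $exchange, string $line): ?string
{
    // The country block may be left empty; otherwise only "1" is accepted here.
    if ($country !== '' && $country !== '1') {
        return null;
    }
    // Each remaining block must be all digits with a fixed length.
    // The D modifier stops $ from matching before a trailing newline.
    if (!preg_match('/^\d{3}$/D', $area)
        || !preg_match('/^\d{3}$/D', $exchange)
        || !preg_match('/^\d{4}$/D', $line)) {
        return null;
    }
    $prefix = $country === '' ? '' : '+1 ';
    return sprintf('%s(%s) %s-%s', $prefix, $area, $exchange, $line);
}

var_dump(joinPhoneBlocks('1', '589', '845', '2889'));
var_dump(joinPhoneBlocks('', '589', '845', '288'));
```

The first call yields +1 (589) 845-2889 and the second yields null because the last block has only three digits.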