Grep Match Phones Surrounded by Text

Time:11-01

I am trying to locate all phone numbers across various files, including JSON and TXT.

A match must contain exactly 10 or 11 digits (for example 012-345-6789 or 0-012-345-6789), no more and no fewer. The phone numbers are often surrounded by text, but sometimes by spaces or tabs (see the examples below). They sometimes also include hyphens "-" and parentheses "()" as separators.

abc0123456789def <- match
abc10123456789def <- match
abc101234567899def <- no match (12 digits)
abc101234567def <- no match (9 digits)

abc 0123456789 def <- match
abc 10123456789 def <- match

abc1(012)345-6789def <- match
abc1-012-345-6789def <- match
abc(012)345-6789def <- match
abc012-345-6789def <- match
abc 1(012)345-6789 def <- match

Your help is super appreciated!

CodePudding user response:

If I recall grep correctly, this should work (the -P flag needs GNU grep built with PCRE support):

grep -iP "(?:^|(?<=\D))\d?(?:\(\d{3}\)|-?\d{3})-?\d{3}-?\d{4}(?=\D|$)"
  • (?:^|(?<=\D)) - behind me is the start of the line or a non-digit char
  • \d? - optional leading digit
  • (?: - start non-capturing group
    • \(\d{3}\) - format equivalent to (555)
    • | - or
    • -?\d{3} - format equivalent to -555 with the hyphen being optional
  • ) - end non-capturing group
  • -?\d{3}-?\d{4} - format equivalent to -555-5555 with optional hyphens
  • (?=\D|$) - ahead of me is a non-digit char or the end of a line
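
As a quick sanity check, the pattern can be run against the sample lines from the question. The sketch below uses Python's `re` module as a stand-in for `grep -P`; this particular pattern uses only syntax that both engines accept. The helper name `find_phone` and the `SAMPLES` list are made up for illustration and are not part of the original answer.

```python
import re

# Pattern copied verbatim from the grep command above.
PHONE = re.compile(r"(?:^|(?<=\D))\d?(?:\(\d{3}\)|-?\d{3})-?\d{3}-?\d{4}(?=\D|$)")

# (line, expected match or None), taken from the examples in the question.
SAMPLES = [
    ("abc0123456789def", "0123456789"),
    ("abc10123456789def", "10123456789"),
    ("abc101234567899def", None),   # 12 digits
    ("abc101234567def", None),      # 9 digits
    ("abc 0123456789 def", "0123456789"),
    ("abc 10123456789 def", "10123456789"),
    ("abc1(012)345-6789def", "1(012)345-6789"),
    ("abc1-012-345-6789def", "1-012-345-6789"),
    ("abc(012)345-6789def", "(012)345-6789"),
    ("abc012-345-6789def", "012-345-6789"),
    ("abc 1(012)345-6789 def", "1(012)345-6789"),
]


def find_phone(line):
    """Return the first phone-like substring in line, or None if there is none."""
    m = PHONE.search(line)
    return m.group(0) if m else None


if __name__ == "__main__":
    for line, expected in SAMPLES:
        got = find_phone(line)
        status = "ok" if got == expected else "MISMATCH"
        print(f"{status:8} {line!r} -> {got!r}")
```

One edge case the samples do not cover: because a hyphen counts as a non-digit, a line such as `12-012-345-6789` still yields `012-345-6789`, so longer hyphenated digit runs may need an extra check.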

Here it is on regex101, using the PHP (PCRE) flavor: https://regex101.com/r/Gdeiq7/1
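
A possible simplification, which is my own suggestion rather than part of the answer: the two boundary checks can be written as negative lookarounds, `(?<!\d)` for "not preceded by a digit" and `(?!\d)` for "not followed by a digit". Both forms are valid PCRE, so the shorter one should also work with `grep -P`. The sketch below compares the two spellings in Python; the names `ORIGINAL`, `SHORTER`, `matches` and `same_results` are made up for illustration.

```python
import re

# Boundaries as written in the answer: start of line or a non-digit behind,
# a non-digit or end of line ahead.
ORIGINAL = re.compile(r"(?:^|(?<=\D))\d?(?:\(\d{3}\)|-?\d{3})-?\d{3}-?\d{4}(?=\D|$)")

# Same body, with the boundaries expressed as negative lookarounds.
SHORTER = re.compile(r"(?<!\d)\d?(?:\(\d{3}\)|-?\d{3})-?\d{3}-?\d{4}(?!\d)")

LINES = [
    "abc10123456789def",
    "abc101234567899def",
    "abc1-012-345-6789def",
    "x 0123456789 y (012)345-6789",
]


def matches(pattern, line):
    """Return every non-overlapping match of pattern in line."""
    return [m.group(0) for m in pattern.finditer(line)]


def same_results(lines):
    """True if both spellings find the same matches on every line."""
    return all(matches(ORIGINAL, l) == matches(SHORTER, l) for l in lines)


if __name__ == "__main__":
    for line in LINES:
        print(line, "->", matches(SHORTER, line))
    print("identical results:", same_results(LINES))
```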
