Reject optional whitespaces after area code and before local number (US)

Time:09-17

I have a regex that parses US phone numbers into 3 strings.

import re

def parse_phone(s):
    reg_ph = re.match(r'^\s*\(?(\d{3})\)?-? *(\d{3})-? *-?(\d{4})', s)
    if reg_ph:
        return reg_ph.groups()
    raise ValueError('not a valid phone number')

s = '  916-2221111 '  # this also works: '(916) 222-1111   '
print(parse_phone(s))

It works perfectly on the numbers:

'(916) 222-1111   '
'  916-2221111 '

Now I need to change the regex so that a ValueError is raised for numbers such as:

s = '916 111-2222'  # whitespace between the area code and the local number, and no ')'

I tried

reg_ph = re.match(r'^\s*\(?(\d{3})\)?\s*-? *(\d{3})-? *-?(\d{4})', s)
reg_ph = re.match(r'^\s*\(?(\d{3})\)?s*-? *(\d{3})-? *-?(\d{4})', s)

but neither of them rejects the string in question.

I will greatly appreciate any ideas. I am very new to Regex!

CodePudding user response:

With Python's re module you can use a conditional that checks whether group 1 captured the opening parenthesis.

If it did, match the closing parenthesis, optional whitespace and 3 digits. Otherwise match - and 3 digits.

If you use re.match you can omit ^, because re.match only matches at the start of the string.

^\s*(\()?\d+(?(1)\)\s*\d{3}|-\d{3})-?\d{4}

If you want to match the whole string and trailing whitespace chars:

^\s*(\()?\d+(?(1)\)\s*\d{3}|-\d{3})-?\d{4}\s*$
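
As a rough illustration of what the trailing \s*$ changes, the sketch below runs both variants against a made-up input that has extra text after the number. The sample string and the variable names are assumptions for the example, not part of the question.

```python
import re

# The pattern without and with the end-of-string part.
prefix_only = r'\s*(\()?\d+(?(1)\)\s*\d{3}|-\d{3})-?\d{4}'
whole_string = prefix_only + r'\s*$'

sample = '916-222-1111 ext. 5'  # hypothetical input with trailing text

# Without \s*$ the leading phone number is enough for a match.
print(bool(re.match(prefix_only, sample)))
# With \s*$ only whitespace may follow the number, so this is rejected.
print(bool(re.match(whole_string, sample)))
```

The first call reports a match and the second does not, so the anchored variant is the safer choice when the whole string has to be a phone number.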

In parts, the pattern matches:

  • ^ Start of string
  • \s* Match optional whitespace chars
  • (\()? Optional group 1, match (
  • \d+ Match 1+ digits
  • (? Conditional
    • (1)\)\s*\d{3} If group 1 exists, match the closing ), optional whitespace chars and 3 digits
    • | Or
    • - Match a literal -
    • \d{3} Match 3 digits
  • ) close conditional
  • -?\d{4} Match optional - and 4 digits
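
The conditional can be tried in isolation. This is a minimal sketch on a reduced pattern (area code only, an assumption made for brevity) that requires the closing parenthesis only when the opening one was captured:

```python
import re

# Group 1 optionally captures "(". The conditional (?(1)\)) demands ")"
# only if group 1 took part in the match; otherwise it matches nothing.
balanced = re.compile(r'(\()?\d{3}(?(1)\))$')

for text in ['(916)', '916', '(916', '916)']:
    print(repr(text), bool(balanced.match(text)))
```

Only '(916)' and '916' match. The two inputs with a single unbalanced parenthesis are rejected.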

For example, using capture groups in the pattern to get the digits:

import re

strings = [' (916) 111-2222', ' 916-2221111 ', '916 111-2222']
pattern = r'\s*(\()?(\d+)(?(1)\)\s*(\d{3})|-(\d{3}))-?(\d{4})\s*$'

for item in strings:
  m = re.match(pattern, item)
  if m:
    # Drop the "(" group and the unused branch group, keep only the digits
    t = tuple(s for s in m.groups() if s is not None and s.isdigit())
    print(t)
  else:
    print("no match for " + item)

Output

('916', '111', '2222')
('916', '222', '1111')
no match for 916 111-2222
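
To get the ValueError the question asks for, the same idea could be wrapped in a function. The following is only a sketch: the name parse_us_phone is hypothetical, and it assumes the area code is always exactly three digits, so \d{3} is used for it.

```python
import re

# Assumption: a US area code is exactly three digits, hence \d{3}.
PHONE_RE = re.compile(r'\s*(\()?(\d{3})(?(1)\)\s*(\d{3})|-(\d{3}))-?(\d{4})\s*$')

def parse_us_phone(s):
    """Return (area, prefix, line) or raise ValueError (hypothetical helper)."""
    m = PHONE_RE.match(s)
    if m is None:
        raise ValueError('not a valid phone number')
    # Exactly one of groups 3 and 4 is set, depending on the branch taken.
    return m.group(2), m.group(3) or m.group(4), m.group(5)

print(parse_us_phone('(916) 222-1111   '))
print(parse_us_phone('  916-2221111 '))
try:
    parse_us_phone('916 111-2222')
except ValueError as exc:
    print('rejected:', exc)
```

Picking the groups by number avoids the isdigit() filter, at the cost of having to remember which branch fills which group.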
