Python regex: validate mobile numbers from Germany and Austria

Time:12-09

I have worked on a regex which matches German and Austrian mobile numbers. I was not able to complete it.

This is what I have so far:

[^\d]((\+49|0049|0|\+43|0043)\s?(1|9)[1567]\d{1,2}([ \-/]*\d){7,8})(?!\d)

You can check the performance in my regex-demo.

  • The international dialing code begins with +49 for Germany and +43 for Austria. This is covered in my regex.
  • The length of a mobile number varies depending on how the number is written. I collected multiple examples in the demo.

Question: How can I improve the regex so that it matches all the examples in my demo?

Furthermore, I want to check whether a specific number matches the defined regex. However, my approach doesn't really work:

import re
slot_value = "0176 48200179"
regex_mobile = r"[^\d]((\+49|0049|0|\+43|0043)\s?(1|9)[1567]\d{1,2}([ \-/]*\d){7,8})(?!\d)"

match = re.fullmatch(regex_mobile, slot_value)
print(match)

>>> None

CodePudding user response:

One option is to assert 10-12 digits with a lookahead after matching the first part of the pattern, which covers the variations of the + sign and the parentheses.

Note that you can write (1|9) as [19], and if you don't need the capture group value, you can omit the parentheses. At the beginning of the pattern you can also shorten the alternatives with a character class, like \+4[39]|004[39]
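As a quick check of that simplification, the sketch below compares the spelled-out alternation with the character-class version on a few prefixes. The names `verbose`, `compact` and `prefixes` are made up for the illustration, and the `0` alternative for national numbers is left out to keep the comparison small.

```python
import re

# Spelled-out alternation, as in the question (capturing group).
verbose = re.compile(r"(\+49|0049|\+43|0043)")
# Shortened form: character class plus a non-capturing group.
compact = re.compile(r"(?:\+4[39]|004[39])")

# Illustrative prefixes; the last three should be rejected by both patterns.
prefixes = ["+49", "+43", "0049", "0043", "+41", "0044", "49"]

verbose_hits = [p for p in prefixes if verbose.fullmatch(p)]
compact_hits = [p for p in prefixes if compact.fullmatch(p)]

print(verbose_hits)
print(compact_hits)

# (1|9) and [19] accept the same single characters as well.
print([c for c in "0129" if re.fullmatch(r"(1|9)", c)])
print([c for c in "0129" if re.fullmatch(r"[19]", c)])
```

Both lists come out the same, so the shorter form can replace the longer one without changing which prefixes are accepted.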

I have started the pattern with an anchor ^ as your pattern starts with [^\d] that actually consumes a character.
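To see what "consumes a character" means in practice, here is a small sketch using the pattern from the question, with the + signs written out (assuming they were lost when the post was copied). `re.fullmatch` has to match the entire string, and `[^\d]` needs one non-digit character before the number, so a string that starts with a digit cannot match.

```python
import re

# Pattern from the question; the "+" signs are an assumed restoration.
question_pattern = r"[^\d]((\+49|0049|0|\+43|0043)\s?(1|9)[1567]\d{1,2}([ \-/]*\d){7,8})(?!\d)"

number = "0176 48200179"

# The string starts with the digit "0", so [^\d] has nothing to match.
print(re.fullmatch(question_pattern, number))

# With a non-digit in front of the number, the pattern does match,
# and group 1 holds the number without the leading character.
found = re.search(question_pattern, " " + number)
print(found.group(1))
```

Replacing `[^\d]` with an anchor avoids requiring that extra character.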

^(?:\+4[39]|004[39]|0|\+\(49\)|\(\+49\))\s?(?=(?:[^\d\n]*\d){10,12}(?!\d))(\()?[19][1567]\d{1,2}(?(1)\))\s?\d(?:[ /-]?\d)+

Regex demo
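For the second part of the question (checking one specific number), a minimal sketch could wrap the pattern in a helper and use `re.fullmatch`. The helper name `is_mobile_number` and the sample list are made up for the illustration, and the + signs and the trailing + quantifier are written out on the assumption that they were dropped from the post.

```python
import re

# Pattern from this answer, split over several lines for readability.
# The "+" characters are an assumed restoration, not taken verbatim.
MOBILE_PATTERN = re.compile(
    r"^(?:\+4[39]|004[39]|0|\+\(49\)|\(\+49\))\s?"  # country or national prefix
    r"(?=(?:[^\d\n]*\d){10,12}(?!\d))"              # lookahead: 10-12 digits follow
    r"(\()?[19][1567]\d{1,2}(?(1)\))"               # network code, optionally in ( )
    r"\s?\d(?:[ /-]?\d)+"                           # remaining digits with separators
)


def is_mobile_number(text: str) -> bool:
    """Hypothetical helper: True if the whole string matches the pattern."""
    return MOBILE_PATTERN.fullmatch(text) is not None


samples = [
    "0176 48200179",
    "+49 15207930698",
    "(+49) 17697123456",
    "+49 (1739) 34-12-84",
    "+49 15902",        # too short
    "+12127319863",     # not a German or Austrian prefix
]

for sample in samples:
    print(sample, is_mobile_number(sample))
```

The conditional `(?(1)\))` only requires a closing parenthesis when the optional opening one was matched, which Python's `re` module supports.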

CodePudding user response:

If you want to match all numbers where:

  • German numbers start with: +49, 0049, +(49)
  • Austrian numbers start with: +43, 0043, +(43)

Your regex will be:

.*(?:\+49|0049|\+\(49\)|\+43|0043|\+\(43\)).*

Demo

You can use finditer to find all numbers.

Below is an example:

import re

# Matches any line that contains one of the German or Austrian country prefixes.
regex = r".*(?:\+49|0049|\+\(49\)|\+43|0043|\+\(43\)).*"

test_str = ("################################### This is allowed ##########################\n"
    "+49 15207930698\n"
    "+49 15207955279\n"
    "+49 1739341284\n"
    "+49 1626589266\n\n"
    "+49915175461907\n"
    "+4915207930698\n"
    "+491635556416\n"
    "017687400179\n"
    "015903900297\n"
    "015175355164\n"
    "015175354885\n"
    "01771789427\n\n"
    "+49 915175461907\n"
    "+43 915175461907\n"
    "+49159039012341\n"
    "+43159039012341\n"
    "+4915207829969\n"
    "+4917697400179\n"
    "+4915903904567\n"
    "+4915902944599\n"
    "+4915902944599\n"
    "+4915903904567\n"
    "+491739341284\n"
    "+431739341284\n\n"
    "+49 176 97 456 123\n"
    "0176 79 123 17 9\n"
    "0176 97 50 01 79\n"
    "0176 79 123 179\n"
    "0174 80123179\n\n"
    "0049 915175461907\n"
    "0043 915175461907\n"
    "0049159039012341\n"
    "0043159039012341\n"
    "004915207829969\n\n"
    "(+49) 17697123456\n"
    "+(49) (1739) 34 12 84\n"
    "+49 (1739) 34-12-84\n\n"
    "############################################################################################\n"
    "################################### This is NOT allowed ####################################\n"
    "012345678901234\n"
    "123w345345345345\n"
    "0123456789101191919\n\n"
    "### Too short\n"
    "+49 15902\n"
    "+49 1590123\n"
    "+49 15903567\n"
    "+49 177178796\n"
    "+49 757130309\n\n"
    "+4915902\n"
    "+491590123\n"
    "+4915903567\n"
    "+49177178796\n"
    "+49757130309\n\n"
    "### Too long\n"
    "+49 1590345985412\n"
    "+491590345985412\n\n"
    "### Not German and not Austrian format\n"
    "+12127319863\n"
    "+13322014056\n"
    "+12126712234\n"
    "+427532697710\n"
    "+417868150810\n"
    "+287533002875\n")

matches = re.finditer(regex, test_str, re.MULTILINE)

for matchNum, match in enumerate(matches, start=1):

    print("Match {matchNum} was found at {start}-{end}: {match}".format(matchNum = matchNum, start = match.start(), end = match.end(), match = match.group()))

    # Print each capture group, if the pattern has any (groups are 1-based).
    for groupNum in range(0, len(match.groups())):
        groupNum = groupNum + 1

        print("Group {groupNum} found at {start}-{end}: {group}".format(groupNum = groupNum, start = match.start(groupNum), end = match.end(groupNum), group = match.group(groupNum)))
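As a shorter illustration of what finditer returns with this pattern, the sketch below runs it over a five-line sample (the sample and the variable names are made up for the illustration). Because the pattern only looks for a country prefix somewhere in the line, it selects every line containing +49, 0049, +43 and so on, regardless of how many digits follow, and it does not select numbers written in the national format starting with 0.

```python
import re

# Same prefix-based pattern as above, with the "+" signs written out.
prefix_pattern = re.compile(r".*(?:\+49|0049|\+\(49\)|\+43|0043|\+\(43\)).*")

# Illustrative sample: one line per number.
sample = (
    "+49 15207930698\n"     # German prefix
    "017687400179\n"        # national format, no country prefix
    "0043 915175461907\n"   # Austrian prefix
    "+49 15902\n"           # German prefix, but very short
    "+12127319863\n"        # other country
)

# "." does not cross newlines, so each match is at most one line.
matched_lines = [m.group() for m in prefix_pattern.finditer(sample)]
print(matched_lines)
```

If the number of digits also matters, the length check has to come from a stricter pattern or a separate step.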