I have worked on a regex which matches German and Austrian mobile numbers. I was not able to complete it.
This is what I have so far:
[^\d]((\+49|0049|0|\+43|0043)\s?(1|9)[1567]\d{1,2}([ \-/]*\d){7,8})(?!\d)
You can check the performance in my regex-demo.
- The international dialing code begins with +49 for Germany and +43 for Austria. This is covered in my regex.
- The length of the mobile numbers varies depending on how the phone number is written. I collected multiple examples in my demo.
Question: How to improve the regex in order to match all examples of my demo?
Furthermore, I want to check whether a specific number matches the defined regex. However, my approach doesn't work:
import re
slot_value = "0176 48200179"
regex_mobile = r"[^\d]((\+49|0049|0|\+43|0043)\s?(1|9)[1567]\d{1,2}([ \-/]*\d){7,8})(?!\d)"
match = re.fullmatch(regex_mobile, slot_value)
print(match)
>>> None
CodePudding user response:
One option could be asserting 10-12 digits after matching the first part of the pattern, which covers the variations of the + and the parentheses.
Note that you can write (1|9) as [19]. If you don't need the capture group value, you can omit the parentheses. At the beginning of the pattern, you can also shorten the alternatives with a character class, like \+4[39]|004[39]
I have started the pattern with an anchor ^, as your pattern starts with [^\d], which actually consumes a character.
^(?:\+4[39]|004[39]|0|\+\(49\)|\(\+49\))\s?(?=(?:[^\d\n]*\d){10,12}(?!\d))(\()?[19][1567]\d{1,2}(?(1)\))\s?\d(?:[ /-]?\d)+
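As a rough illustration of how this pattern could be used for the check from the question, here is a minimal Python sketch. It assumes the pattern is written with literal `+` signs and a trailing `+` quantifier; the names `MOBILE_RE` and `is_mobile` are made up for this example.

```python
import re

# Pattern from this response, written with literal "+" signs.
# The lookahead asserts 10-12 digits after the prefix; group 1 is the
# optional "(" that must be closed again by the conditional (?(1)\)).
MOBILE_RE = re.compile(
    r"^(?:\+4[39]|004[39]|0|\+\(49\)|\(\+49\))\s?"
    r"(?=(?:[^\d\n]*\d){10,12}(?!\d))"
    r"(\()?[19][1567]\d{1,2}(?(1)\))\s?\d(?:[ /-]?\d)+"
)


def is_mobile(value: str) -> bool:
    """Return True if the whole string looks like a German/Austrian mobile number."""
    return MOBILE_RE.fullmatch(value) is not None


print(is_mobile("0176 48200179"))    # True  (value from the question)
print(is_mobile("+49 15207930698"))  # True
print(is_mobile("+49 15902"))        # False (too short)
print(is_mobile("+12127319863"))     # False (other country code)
```

Because the pattern no longer begins with `[^\d]`, `re.fullmatch` does not need an extra non-digit character in front of the number, which is why the value from the question is accepted here.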
CodePudding user response:
If you want to match all numbers, where:
- German numbers start with +49, 0049, or +(49)
- Austrian numbers start with +43, 0043, or +(43)
your regex will be:
.*(?:\+49|0049|\+\(49\)|\+43|0043|\+\(43\)).*
You can use finditer to find all numbers.
Below is an example:
import re
regex = r".*(?:\+49|0049|\+\(49\)|\+43|0043|\+\(43\)).*"
test_str = ("################################### This is allowed ##########################\n"
    "+49 15207930698\n"
    "+49 15207955279\n"
    "+49 1739341284\n"
    "+49 1626589266\n\n"
    "+49915175461907\n"
    "+4915207930698\n"
    "+491635556416\n"
    "017687400179\n"
    "015903900297\n"
    "015175355164\n"
    "015175354885\n"
    "01771789427\n\n"
    "+49 915175461907\n"
    "+43 915175461907\n"
    "+49159039012341\n"
    "+43159039012341\n"
    "+4915207829969\n"
    "+4917697400179\n"
    "+4915903904567\n"
    "+4915902944599\n"
    "+4915902944599\n"
    "+4915903904567\n"
    "+491739341284\n"
    "+431739341284\n\n"
    "+49 176 97 456 123\n"
    "0176 79 123 17 9\n"
    "0176 97 50 01 79\n"
    "0176 79 123 179\n"
    "0174 80123179\n\n"
    "0049 915175461907\n"
    "0043 915175461907\n"
    "0049159039012341\n"
    "0043159039012341\n"
    "004915207829969\n\n"
    "(+49) 17697123456\n"
    "+(49) (1739) 34 12 84\n"
    "+49 (1739) 34-12-84\n\n"
    "############################################################################################\n"
    "################################### This is NOT allowed ####################################\n"
    "012345678901234\n"
    "123w345345345345\n"
    "0123456789101191919\n\n"
    "### Too short\n"
    "+49 15902\n"
    "+49 1590123\n"
    "+49 15903567\n"
    "+49 177178796\n"
    "+49 757130309\n\n"
    "+4915902\n"
    "+491590123\n"
    "+4915903567\n"
    "+49177178796\n"
    "+49757130309\n\n"
    "### Too long\n"
    "+49 1590345985412\n"
    "+491590345985412\n\n"
    "### Not German and not Austrian format\n"
    "+12127319863\n"
    "+13322014056\n"
    "+12126712234\n"
    "+427532697710\n"
    "+417868150810\n"
    "+287533002875\n")
# Iterate over every match in the multi-line test string
matches = re.finditer(regex, test_str, re.MULTILINE)
for matchNum, match in enumerate(matches, start=1):
    print("Match {matchNum} was found at {start}-{end}: {match}".format(matchNum=matchNum, start=match.start(), end=match.end(), match=match.group()))
    # Report capture groups, if the pattern defines any
    for groupNum in range(0, len(match.groups())):
        groupNum = groupNum + 1
        print("Group {groupNum} found at {start}-{end}: {group}".format(groupNum=groupNum, start=match.start(groupNum), end=match.end(groupNum), group=match.group(groupNum)))
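To make it easier to see what this pattern actually selects, here is a smaller sketch on a handful of lines taken from the test string. It assumes the prefixes are written with literal `+` signs; the names `prefix_line`, `sample` and `matched` are only illustrative.

```python
import re

# Same idea as the pattern above: any line that contains one of the prefixes.
prefix_line = re.compile(r".*(?:\+49|0049|\+\(49\)|\+43|0043|\+\(43\)).*")

sample = (
    "+49 15207930698\n"    # allowed
    "017687400179\n"       # allowed, national format without country code
    "0043 915175461907\n"  # allowed
    "+49 15902\n"          # too short
    "+12127319863\n"       # other country code
)

# "." does not cross newlines, so each match is one whole line.
matched = [m.group() for m in prefix_line.finditer(sample)]
print(matched)
# ['+49 15207930698', '0043 915175461907', '+49 15902']
```

In this sample the pattern selects lines by prefix only: the national format starting with 0 is not found, and the too-short number is. If the length should be checked as well, a digit-count assertion would still be needed.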