I have the following complete code example:

import re

examples = [
    "D1",        # expected: ('1')
    "D1sjdgf",   # ('1')
    "D1.2",      # ('1', '2')
    "D1.2.3",    # ('1', '2', '3')
    "D3.10.3x",  # ('3', '10', '3')
    "D3.10.11"   # ('3', '10', '11')
]

for s in examples:
    result = re.search(r'^D(\d+)(?:\.(\d+)(?:\.(\d+)))', s)
    print(s, result.groups())

where I want to match the 1, 2 or 3 numbers in an expression that always starts with the letter "D". There could be one of them, or two, or three. I am not interested in anything after the last digit.
I would expect that my regex would match e.g. D3.10.3x and return ('3','10','3'), but instead it returns only ('3',). I do not understand why.
This is how I read my regex ^D(\d+)(?:\.(\d+)(?:\.(\d+))):
^D : matches "D" at the start
(\d+) : matches the first number inside a group
(?: : starts a non-matching group. I do not want to get this group back.
\. : a literal point
(\d+) : a group of one or more digits I want to "catch"
I also do not understand what a "non-capturing" group means in this context, as it is used in this answer.
CodePudding user response:
You may use this regex, with a start anchor and the second and third capture groups nested inside optional non-capture groups:
^D(\d+)(?:\.(\d+)(?:\.(\d+))?)?
Explanation:
^ : Start of string
D : Match letter D
(\d+) : Match 1+ digits in capture group #1
(?: : Start outer non-capture group
\. : Match a dot
(\d+) : Match 1+ digits in capture group #2
(?: : Start inner non-capture group
\. : Match a dot
(\d+) : Match 1+ digits in capture group #3
)? : End inner optional non-capture group
)? : End outer optional non-capture group
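On the "non-capturing" question: a (?:...) group only groups part of the pattern and does not add an entry to groups(). It is the trailing ? that makes a group optional, not the ?: prefix. The following is a small sketch of that difference; the three shortened patterns and their variable names are made up for illustration and are not taken from the question.

```python
import re

# Capturing parentheses: the dotted part is reported as its own group.
capturing = re.compile(r'^D(\d+)(\.(\d+))?')
# Non-capturing (?:...): same match, but the dotted part is not reported.
non_capturing = re.compile(r'^D(\d+)(?:\.(\d+))?')
# Non-capturing but NOT optional: the ".digits" part is now mandatory.
required = re.compile(r'^D(\d+)(?:\.(\d+))')

print(capturing.search("D1.2").groups())      # ('1', '.2', '2')
print(non_capturing.search("D1.2").groups())  # ('1', '2')
print(non_capturing.search("D1").groups())    # ('1', None)
print(required.search("D1"))                  # None
```

Without the trailing ? the group must be present, so an input like "D1" does not match at all and search() returns None.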
import re

examples = [
    "D1",        # expected: ('1')
    "D1sjdgf",   # ('1')
    "D1.2",      # ('1', '2')
    "D1.2.3",    # ('1', '2', '3')
    "D3.10.3x",  # ('3', '10', '3')
    "D3.10.11"   # ('3', '10', '11')
]

rx = re.compile(r'^D(\d+)(?:\.(\d+)(?:\.(\d+))?)?')

for s in examples:
    result = rx.search(s)
    print(s, result.groups())

Output:
D1 ('1', None, None)
D1sjdgf ('1', None, None)
D1.2 ('1', '2', None)
D1.2.3 ('1', '2', '3')
D3.10.3x ('3', '10', '3')
D3.10.11 ('3', '10', '11')
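
Groups that did not take part in the match come back as None. If you would rather get the shorter tuples from the comments in the question, one option is to drop the None entries afterwards. This is only a sketch: the helper name version_parts is hypothetical, and returning an empty tuple for non-matching input is an assumption about what you want there.

```python
import re

rx = re.compile(r'^D(\d+)(?:\.(\d+)(?:\.(\d+))?)?')

def version_parts(s):
    """Return the matched numbers as a tuple, without None placeholders."""
    m = rx.search(s)
    if m is None:
        # Assumption: input that does not start with "D<digits>" yields ().
        return ()
    return tuple(g for g in m.groups() if g is not None)

print(version_parts("D1"))        # ('1',)
print(version_parts("D3.10.3x"))  # ('3', '10', '3')
print(version_parts("X1"))        # ()
```

Checking for None before calling groups() also avoids an AttributeError on input that does not match.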