text:
text1 = 'xx(aa)(bb)xx'
text2 = 'xx(aa(bb))xx'
expectation:
('aa', 'bb')
('aa(bb)', 'bb')
My approach, which does not produce the expected output:
re.compile(r'\(\s?(.+?)\s?\)')
CodePudding user response:
You can install the PyPI regex module and use
import regex
texts = ['xx(aa)(bb)xx', 'xx(aa(bb))xx']
rx = r'\(((?:[^()]++|(?R))*)\)'
for text in texts:
    print(regex.findall(rx, text, overlapped=True))
See the Python demo. Output:
['aa', 'bb']
['aa(bb)', 'bb']
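The \(((?:[^()]++|(?R))*)\) regex is a common PCRE-compliant pattern that matches strings between nested paired parentheses: [^()]++ consumes a run of characters other than parentheses, and (?R) recurses into the whole pattern to match a nested pair. I added a capturing group for the contents between the parentheses.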
To also get the nested (overlapping) matches, the overlapped=True option is passed to regex.findall.
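If installing a third-party module is not an option, the same result can be sketched with a plain stack scan instead of a recursive regex. This is a different technique from the one above, not part of the regex module; the function name paren_contents is made up for illustration, and the sketch assumes that unmatched parentheses should simply be ignored.

```python
def paren_contents(text):
    """Return the contents of every balanced (...) pair in text,
    ordered by the position of the opening parenthesis."""
    stack = []  # indices of '(' that are still open
    found = []  # (open_index, content) pairs
    for i, ch in enumerate(text):
        if ch == '(':
            stack.append(i)
        elif ch == ')' and stack:
            # Close the most recently opened pair.
            start = stack.pop()
            found.append((start, text[start + 1:i]))
    # Inner pairs close first, so sort by opening position
    # to list each outer pair before the pairs nested in it.
    found.sort()
    return [content for _, content in found]


for text in ['xx(aa)(bb)xx', 'xx(aa(bb))xx']:
    print(paren_contents(text))
```

For the two sample strings this prints ['aa', 'bb'] and ['aa(bb)', 'bb'], the same lists as the regex version.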