I want to match cc dd, but only on lines that don't start with aa.
import re
s = 'bb cc dd ee\naa : bb cc dd ee\n11 cc dd ee'
pp = re.compile(r'(?P<n1>ee)|(?P<n2>^(?!aa\b).*\bcc dd\b)', re.MULTILINE)
def _rep(x):
    print(x.groupdict())
    return [f'<{k}>' for k, v in x.groupdict().items() if v is not None][0]
rr = pp.sub(_rep, s)
print(rr)
Current result:
# print(x.groupdict())
{'n1': None, 'n2': 'bb cc dd'}
{'n1': 'ee', 'n2': None}
{'n1': 'ee', 'n2': None}
{'n1': None, 'n2': '11 cc dd'}
{'n1': 'ee', 'n2': None}
# print(rr)
<n2> <n1>
aa : bb cc dd <n1>
<n2> <n1>
Desired result:
# print(x.groupdict())
{'n1': None, 'n2': 'cc dd'}
{'n1': 'ee', 'n2': None}
{'n1': 'ee', 'n2': None}
{'n1': None, 'n2': 'cc dd'}
{'n1': 'ee', 'n2': None}
# print(rr)
bb <n2> <n1>
aa : bb cc dd <n1>
11 <n2> <n1>
How can I get this result?
Answer:
With re, a single pattern cannot do what you need: you expect multiple occurrences per line to be replaced, so the check for aa at the line start has to happen in a lookbehind, and that lookbehind must be variable-width (not supported by re).
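The limitation can be checked directly. The snippet below is only a small illustration, not part of the original code; the helper name compiles_with_re is made up for this example. It tries to compile a pattern with the standard-library re module and reports whether that succeeds.

```python
import re

def compiles_with_re(pattern):
    """Hypothetical helper: True if the stdlib re module accepts the pattern."""
    try:
        re.compile(pattern, re.MULTILINE)
        return True
    except re.error:
        # re raises an error for lookbehinds that are not fixed-width
        return False

# Fixed-width lookbehind (always 3 characters): accepted by re
print(compiles_with_re(r'(?<!aa )\bcc dd\b'))      # True
# Variable-width lookbehind (contains .*): rejected by re
print(compiles_with_re(r'(?<!^aa\b.*)\bcc dd\b'))  # False
```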
You need to install the PyPI regex module by running pip install regex in your terminal/console, and then use
import regex
s = 'bb cc dd ee\naa : bb cc dd ee\n11 cc dd ee'
pp = regex.compile(r'(?P<n1>ee)|(?<!^aa\b.*)\b(?P<n2>cc dd)\b', regex.MULTILINE)
def _rep(x):
    #print(x.groupdict())
    return [f'<{k}>' for k, v in x.groupdict().items() if v is not None][0]
rr = pp.sub(_rep, s)
print(rr)
Here, (?<!^aa\b.*)\b(?P<n2>cc dd)\b matches cc dd as a whole word and captures it into group n2, but only if it is not preceded, anywhere earlier on the same line, by the whole word aa at the start of that line. regex.MULTILINE makes ^ match at the start of every line, and .* lets the lookbehind reach back from cc dd to the line start, so the check works even when aa does not come immediately before cc dd. Since . does not match line breaks, the lookbehind never looks past the current line.
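If installing a third-party module is not an option, one possible workaround is to drop the single-pattern approach and handle the input line by line with the standard-library re module. The following is only a sketch under that assumption: the names both, only_n1, starts_with_aa and replace_line are invented for illustration, and lines starting with the whole word aa simply get the n1 replacement alone.

```python
import re

s = 'bb cc dd ee\naa : bb cc dd ee\n11 cc dd ee'

# Group names n1 and n2 follow the question; the other names are illustrative.
both = re.compile(r'(?P<n1>ee)|\b(?P<n2>cc dd)\b')
only_n1 = re.compile(r'(?P<n1>ee)')
starts_with_aa = re.compile(r'aa\b')

def _rep(m):
    # lastgroup is the name of the named group that took part in the match
    return f'<{m.lastgroup}>'

def replace_line(line):
    # Lines that start with the whole word aa keep cc dd untouched
    pattern = only_n1 if starts_with_aa.match(line) else both
    return pattern.sub(_rep, line)

rr = '\n'.join(replace_line(line) for line in s.split('\n'))
print(rr)
```

For the sample string this prints the three lines from the desired result. The trade-off is that the line check lives in Python code rather than in the pattern, so it cannot be reused where only a single regex is accepted.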