If I have a string like
b*&^6bolyb{[--9_(marker1JN9&[7&9bkey=- )*.,mljmarker2,pi*[80[)(Mmp0oiymarker1ojm)*[marker2,;i0m980m.9u090marker1*(7hp0Key0()mu90marker2
how do I extract the part between marker1 and marker2 if it contains key (or 'Key', or any other case variation)?
So I'd like the code to return:
['JN9&[7&9bkey=- )*.,mlj', '*(7hp0Key0()mu90']
CodePudding user response:
We can use re.findall here:
import re

inp = "b*&^6bolyb{[--9_(marker1JN9&[7&9bkey=- )*.,mljmarker2,pi*[80[)(Mmp0oiymarker1ojm)*[marker2,;i0m980m.9u090marker1*(7hp0key0()mu90marker2"
matches = re.findall(r'marker1(?:(?!marker[12]).)*[kK]ey(?:(?!marker[12]).)*marker2', inp)
print(matches)  # ['marker1JN9&[7&9bkey=- )*.,mljmarker2', 'marker1*(7hp0key0()mu90marker2']
The regex pattern used above ensures that we match a marker1 ... key ... marker2 sequence without crossing over more than one marker1 or marker2 boundary:
marker1                 match "marker1"
(?:(?!marker[12]).)*    match any content WITHOUT crossing a "marker1" or "marker2" boundary
[kK]ey                  match "key" or "Key"
(?:(?!marker[12]).)*    again match without crossing a marker
marker2                 match "marker2"
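The pattern above returns the markers along with the content, and [kK]ey only covers two spellings. As a rough sketch of one way to get exactly the list asked for in the question, the variant below wraps the middle part in a capture group and matches "key" case-insensitively with a scoped inline flag (Python 3.6+). The names SAFE and extract_between are made up for this example, and re.DOTALL is an assumption added so the content may span newlines.

```python
import re

# The string from the question, split over two literals for readability.
inp = ("b*&^6bolyb{[--9_(marker1JN9&[7&9bkey=- )*.,mljmarker2,pi*[80[)(Mmp0oiy"
       "marker1ojm)*[marker2,;i0m980m.9u090marker1*(7hp0Key0()mu90marker2")

# Consume one character at a time, but never step onto the start of a marker.
SAFE = r'(?:(?!marker[12]).)*'

# The capture group keeps only the text between the markers.
# (?i:key) ignores case for "key" only; the markers stay case-sensitive.
pattern = re.compile(
    r'marker1(' + SAFE + r'(?i:key)' + SAFE + r')marker2',
    re.DOTALL,  # assumption: let "." match newlines as well
)

def extract_between(text):
    """Return every marker1...marker2 body that contains 'key' in any case."""
    # With exactly one capture group, findall returns just that group.
    return pattern.findall(text)

print(extract_between(inp))
# ['JN9&[7&9bkey=- )*.,mlj', '*(7hp0Key0()mu90']
```

Passing re.IGNORECASE for the whole pattern would also work, but it would accept "Marker1" or "MARKER2" as boundaries too, which may or may not be what you want.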