The simplest way to explain is with code. I have this:
Str = 'Floor_Live_Patterened_SpanPairs_1: [[-3, 0, 0, 5.5], [-3, 5.5, 0, 9.5]]Floor_Live_Patterened_SpanPairs_2: [[-3, 0, 0, 5.5], [-3, 9.5, 0, 13.5]]Floor_Live_Patterened_SpanPairs_3: [[-3, 5.5, 0, 9.5], [-3, 9.5, 0, 13.5]]'
from re import findall
findall(r'[^\]\]]+\]\]?', Str)
What I get is,
['Floor_Live_Patterened_SpanPairs_1: [[-3, 0, 0, 5.5]',
', [-3, 5.5, 0, 9.5]]',
'Floor_Live_Patterened_SpanPairs_2: [[-3, 0, 0, 5.5]',
', [-3, 9.5, 0, 13.5]]',
'Floor_Live_Patterened_SpanPairs_3: [[-3, 5.5, 0, 9.5]',
', [-3, 9.5, 0, 13.5]]']
I assume it is splitting on a single ']' instead of on ']]'. I want the result below:
['Floor_Live_Patterened_SpanPairs_1: [[-3, 0, 0, 5.5], [-3, 5.5, 0, 9.5]]',
'Floor_Live_Patterened_SpanPairs_2: [[-3, 0, 0, 5.5], [-3, 9.5, 0, 13.5]]',
'Floor_Live_Patterened_SpanPairs_3: [[-3, 5.5, 0, 9.5], [-3, 9.5, 0, 13.5]]']
I have gone through the documentation but couldn't work out how to achieve this, or what change to make to the regex passed to findall above. A similar technique was used in one of the answers to "In Python, how do I split a string and keep the separators?"
CodePudding user response:
Looks like you could just use a lazy dot, .*?, until ]] to get your desired matches:
.*?\]\]
See this demo at regex101 (explanation on the right side)
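As a minimal sketch of the lazy-dot approach, the snippet below runs the pattern against the string from the question. The variable names text and matches are only illustrative, and it assumes each record ends at the first ]] that follows it.

```python
import re

# Sample input copied from the question (three records with no separator).
text = ('Floor_Live_Patterened_SpanPairs_1: [[-3, 0, 0, 5.5], [-3, 5.5, 0, 9.5]]'
        'Floor_Live_Patterened_SpanPairs_2: [[-3, 0, 0, 5.5], [-3, 9.5, 0, 13.5]]'
        'Floor_Live_Patterened_SpanPairs_3: [[-3, 5.5, 0, 9.5], [-3, 9.5, 0, 13.5]]')

# .*? consumes as few characters as possible, so each match stops
# at the first literal ]] it can reach.
matches = re.findall(r'.*?\]\]', text)

for match in matches:
    print(match)
```

Because every match ends with ]], joining the matches back together reproduces the original string, which is a quick way to check that nothing was dropped.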
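Be aware that [^\]\]] does not match substrings that are not ]]. It is a character class, so it matches any single character other than ], and listing the closing bracket twice changes nothing. It would work the same if you removed one of the closing brackets. Read more about character classes here.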
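To illustrate the point about character classes, here is a small sketch comparing a class that lists ] twice with one that lists it once. The sample string and variable names are made up for the illustration, and the pattern shape (a negated class followed by \]\]?) is assumed to be the one used in the question.

```python
import re

# Shortened, hypothetical input with the same shape as the question's data.
sample = 'a: [[1, 2], [3, 4]]b: [[5, 6], [7, 8]]'

# Listing ] twice inside the class does not make it mean "not ]]".
with_duplicate = re.findall(r'[^\]\]]+\]\]?', sample)
without_duplicate = re.findall(r'[^\]]+\]\]?', sample)

print(with_duplicate == without_duplicate)
print(with_duplicate)
```

Both patterns stop at the first single ], which is why the question's output is split in the middle of each record.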
CodePudding user response:
Since you're trying to match balanced bracket constructs, a more robust solution would be to use a regex engine that supports recursion, such as the third-party regex module, and use the (?R) pattern to recursively match balanced pairs of brackets:
import regex
regex.findall(r'.*?\[(?>[^[\]]|(?R))*\]', Str)
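If installing a third-party module is not an option, the same balanced-bracket idea can be sketched in plain Python with a depth counter. This is a swapped-in technique rather than a regex, split_balanced is a hypothetical helper name, and the sketch assumes the brackets in the input are well formed.

```python
def split_balanced(text):
    """Split text into chunks, each ending where the bracket depth returns to zero."""
    chunks = []
    depth = 0
    start = 0
    for index, char in enumerate(text):
        if char == '[':
            depth += 1
        elif char == ']':
            depth -= 1
            # Depth zero means the outermost bracket just closed.
            if depth == 0:
                chunks.append(text[start:index + 1])
                start = index + 1
    return chunks


# Sample input copied from the question.
text = ('Floor_Live_Patterened_SpanPairs_1: [[-3, 0, 0, 5.5], [-3, 5.5, 0, 9.5]]'
        'Floor_Live_Patterened_SpanPairs_2: [[-3, 0, 0, 5.5], [-3, 9.5, 0, 13.5]]'
        'Floor_Live_Patterened_SpanPairs_3: [[-3, 5.5, 0, 9.5], [-3, 9.5, 0, 13.5]]')

for chunk in split_balanced(text):
    print(chunk)
```

Unlike the lazy-dot pattern, this keeps working if the lists are nested more deeply, since it tracks depth instead of looking for a literal ]].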