Splitting String Using RE findall () in Python using ']]' as delimiter but keeping the del

Time:09-07

The simplest way to explain is with this code:

Str = 'Floor_Live_Patterened_SpanPairs_1: [[-3, 0, 0, 5.5], [-3, 5.5, 0, 9.5]]Floor_Live_Patterened_SpanPairs_2: [[-3, 0, 0, 5.5], [-3, 9.5, 0, 13.5]]Floor_Live_Patterened_SpanPairs_3: [[-3, 5.5, 0, 9.5], [-3, 9.5, 0, 13.5]]'
from re import findall

findall(r'[^\]\]]+\]\]?', Str)

What I get is:

['Floor_Live_Patterened_SpanPairs_1: [[-3, 0, 0, 5.5]',
 ', [-3, 5.5, 0, 9.5]]',
 'Floor_Live_Patterened_SpanPairs_2: [[-3, 0, 0, 5.5]',
 ', [-3, 9.5, 0, 13.5]]',
 'Floor_Live_Patterened_SpanPairs_3: [[-3, 5.5, 0, 9.5]',
 ', [-3, 9.5, 0, 13.5]]']

I assume it is splitting on a single ']' instead of ']]'. I want the result below:

['Floor_Live_Patterened_SpanPairs_1: [[-3, 0, 0, 5.5], [-3, 5.5, 0, 9.5]]',
 'Floor_Live_Patterened_SpanPairs_2: [[-3, 0, 0, 5.5], [-3, 9.5, 0, 13.5]]',
 'Floor_Live_Patterened_SpanPairs_3: [[-3, 5.5, 0, 9.5], [-3, 9.5, 0, 13.5]]']

I have gone through the documentation but couldn't work out how to achieve this, or what change to make to the findall call above. A similar technique was used in one of the answers to "In Python, how do I split a string and keep the separators?"

CodePudding user response:

Looks like you could just use a lazy dot .+? until ]] to get your desired matches.

.+?\]\]

See this demo at regex101 (explanation on the right side)
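As a minimal sketch of the lazy-dot pattern applied to the string from the question, using only the standard re module (the variable names text and matches are my own, not from the original post):

```python
import re

# Input string from the question; the name `text` is chosen here for illustration.
text = (
    'Floor_Live_Patterened_SpanPairs_1: [[-3, 0, 0, 5.5], [-3, 5.5, 0, 9.5]]'
    'Floor_Live_Patterened_SpanPairs_2: [[-3, 0, 0, 5.5], [-3, 9.5, 0, 13.5]]'
    'Floor_Live_Patterened_SpanPairs_3: [[-3, 5.5, 0, 9.5], [-3, 9.5, 0, 13.5]]'
)

# .+? grows one character at a time and stops at the first ']]' it reaches,
# so each match ends with the delimiter still attached.
matches = re.findall(r'.+?\]\]', text)

for m in matches:
    print(m)
```

This should print the three records one per line, each ending in ]]. Note that . does not match newlines by default, so input spanning several lines would need the re.DOTALL flag.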

Be aware that [^\]\]] does not mean "any substring other than ]]". It is a character class, which matches a single character that is not ], so it would behave the same if you removed one of the closing brackets. Read more about character classes here.
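The point about character classes can be checked with a short sketch. The sample string and variable names are made up for illustration:

```python
import re

# Hypothetical sample containing both a single ']' and a double ']]'.
sample = 'ab]cd]]ef'

# A character class is a set of single characters, so listing ']' twice
# inside it adds nothing: both patterns stop at every ']'.
doubled = re.findall(r'[^\]\]]+', sample)
single = re.findall(r'[^\]]+', sample)

print(doubled)
print(single)
print(doubled == single)
```

Both calls return the same pieces, which is why the pattern in the question breaks at each single ] rather than only at ]].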

CodePudding user response:

Since you're trying to match balanced bracket constructs, a more robust solution would be to use a regex engine that supports recursion, such as the regex module, and use the (?R) pattern to recursively match balanced pairs of brackets:

import regex

regex.findall(r'.*?\[(?>[^[\]]|(?R))*\]', Str)

Demo: https://replit.com/@blhsing/TroubledTurboLint
