How to split a string by bracketed and non-bracketed tokens in Python?

Time:05-25

I have a string of words separated by spaces, in which some runs of words are grouped in parentheses. For example:

word1 word2 (word3 word4 word5) word6 (word7 word8) word9 word10 word11 (word12 word13 word14)

I want to split it on spaces, but treat the words inside each pair of brackets as a single token, so for the example above the result would be:

[word1, word2, (word3 word4 word5), word6, (word7 word8), word9, word10, word11, (word12 word13 word14)]

Whether the brackets themselves are kept in the resulting list is not important. What matters is that the words inside brackets count as a single token. How can I do this?

CodePudding user response:

I suppose you could make a regex that looks for either anything within parentheses or a run of word characters. That might look something like:

import re

s = 'word1 word2 (word3 word4 word5) word6 (word7 word8) word9 word10 word11 (word12 word13 word14)'

re.findall(r'(?:\(.*?\))|(?:\w+)', s)

Which will give you:

['word1',
 'word2',
 '(word3 word4 word5)',
 'word6',
 '(word7 word8)',
 'word9',
 'word10',
 'word11',
 '(word12 word13 word14)']
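
Since the question says the brackets themselves are optional in the result, here is a minimal sketch that wraps the same pattern in a function. The names `split_tokens` and `keep_brackets` are made up for illustration, and it assumes the parentheses are balanced and never nested:

```python
import re

# Same idea as the pattern above: a lazy "( ... )" group, or a run of word characters.
TOKEN_RE = re.compile(r'\(.*?\)|\w+')

def split_tokens(text, keep_brackets=True):
    """Split text into words, treating each (...) group as one token.

    Hypothetical helper; assumes balanced, non-nested parentheses.
    """
    tokens = TOKEN_RE.findall(text)
    if not keep_brackets:
        # Strip the surrounding parentheses from grouped tokens only.
        tokens = [t[1:-1] if t.startswith('(') else t for t in tokens]
    return tokens

s = 'word1 word2 (word3 word4 word5) word6 (word7 word8)'
print(split_tokens(s))
# ['word1', 'word2', '(word3 word4 word5)', 'word6', '(word7 word8)']
print(split_tokens(s, keep_brackets=False))
# ['word1', 'word2', 'word3 word4 word5', 'word6', 'word7 word8']
```

Note that `\w+` only matches letters, digits, and underscores, so a word like `foo-bar` outside brackets would come back as two tokens. If your words can contain other characters, swapping `\w+` for something like `[^\s()]+` may fit better.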