I have a string of words separated by spaces, where some groups of words are enclosed in parentheses. For example:
word1 word2 (word3 word4 word5) word6 (word7 word8) word9 word10 word11 (word12 word13 word14)
I want to split it on spaces, but treat the words inside each pair of parentheses as a single item, so for the example above the result would be:
[word1, word2, (word3 word4 word5), word6, (word7 word8), word9, word10, word11, (word12 word13 word14)]
It does not matter whether the items in the resulting list keep their parentheses. What matters is that each parenthesized group counts as a single item. How can I do this?
CodePudding user response:
I suppose you could write a regex that matches either anything within parentheses or a run of word characters. That might look something like:
import re
s = 'word1 word2 (word3 word4 word5) word6 (word7 word8) word9 word10 word11 (word12 word13 word14)'
re.findall(r'(?:\(.*?\))|(?:\w+)', s)
Which will give you:
['word1',
'word2',
'(word3 word4 word5)',
'word6',
'(word7 word8)',
'word9',
'word10',
'word11',
'(word12 word13 word14)']
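If you want this as a reusable helper, here is a minimal sketch of the same idea. The function name `split_keep_groups` and the `strip_parens` flag are my own names, not anything from a library. It also swaps `\w+` for `[^\s()]+`, assuming you want words containing hyphens or apostrophes to stay in one piece. It does not handle nested parentheses.

```python
import re

# Either a parenthesized group with no nested parentheses,
# or a run of characters that are neither whitespace nor parentheses.
TOKEN_RE = re.compile(r'\([^()]*\)|[^\s()]+')


def split_keep_groups(text, strip_parens=False):
    """Split on whitespace, keeping each parenthesized group as one item."""
    tokens = TOKEN_RE.findall(text)
    if strip_parens:
        # Drop the surrounding parentheses from grouped items only.
        tokens = [t[1:-1] if t.startswith('(') else t for t in tokens]
    return tokens


s = 'word1 word2 (word3 word4 word5) word6 (word7 word8) word9'
print(split_keep_groups(s))
print(split_keep_groups(s, strip_parens=True))
```

Since the question says the parentheses themselves are optional in the result, `strip_parens=True` returns the grouped words without them.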