I have a string which is:
str2s = 'orange,juices,apple,apple[-2]'
I'm trying to extract all of those words, including the bracketed part, so I want:
'orange', 'juices', 'apple', 'apple[-2]'
I tried using:
re.findall(
'[A-Za-z][A-Za-z0-9_%\\.]{0,}\[?[a-zA-Z0-9_]*\]?',
str2s,
flags=re.IGNORECASE
)
But it only returned:
'orange', 'juices', 'apple', 'apple['
How can I get the -2] as well?
CodePudding user response:
To split a string into a list, I think you have to know the exact separator, or at least be able to identify the separators, whether that is , or [,.] or something else.
If you can't tell the separators apart from the items, I think it will be very hard to achieve your goal with common methods.
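When the separator is known, as the comma is here, a plain split may already be enough. A minimal sketch, assuming the comma is the only separator and never appears inside an item (the name items is just for illustration):

```python
# Assumption: "," is the only separator and never occurs inside an item.
str2s = 'orange,juices,apple,apple[-2]'

items = str2s.split(',')
print(items)  # ['orange', 'juices', 'apple', 'apple[-2]']
```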
That said, for your case of orange,juices,apple,apple[-2], you could use r'([\w\[\]\-]+)':
https://regex101.com/r/2Xj7AR/1
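As a quick check of that pattern against the string from the question, here is a small sketch. It assumes the items consist only of word characters, square brackets, and hyphens (the name matches is just for illustration):

```python
import re

str2s = 'orange,juices,apple,apple[-2]'

# One or more word characters, brackets, or hyphens. The comma is not in the
# character class, so it effectively acts as the separator.
matches = re.findall(r'([\w\[\]\-]+)', str2s)
print(matches)  # ['orange', 'juices', 'apple', 'apple[-2]']
```

Note that this character class is permissive: it would also accept a run such as [-] on its own, so it only suits input that is already known to be well formed.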
CodePudding user response:
The following code will extract the words the way you want:
import re
words = re.compile(r'\w+\[?-?\d*\]?', re.IGNORECASE)
s = 'orange,juices,apple,apple[-2],pineapple[20]'
words.findall(s)
Which will result in the following:
['orange', 'juices', 'apple', 'apple[-2]', 'pineapple[20]']
Bear in mind that the snippet above was written with the example string you provided as the base. If you need to match other types of words (007