The string comes in as "<item1><item2><item3>"
I would like to convert it to ['item1','item2','item3']
I was thinking of using string.split('><') and then stripping the leading < and the trailing >.
But this breaks if the string has text before or after the items, or if there is a space between a '>' and the next '<'. Is there another way?
CodePudding user response:
Use a regex with a capturing group, ( ). re.findall returns the captured text of every match:
import re
s = '<item1><item2><item3>'
re.findall(r'<(\w+)>', s)
Output
['item1', 'item2', 'item3']
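To check the cases raised in the question (text before or after the items, whitespace between a '>' and the next '<'), here is a minimal sketch. The helper name extract_items and the second sample string are made up for illustration.

```python
import re

def extract_items(s):
    # Hypothetical helper: return the text captured inside each <...> pair.
    # \w+ matches one or more letters, digits, or underscores.
    return re.findall(r'<(\w+)>', s)

print(extract_items('<item1><item2><item3>'))
# ['item1', 'item2', 'item3']

# Surrounding text and spaces between the tags are simply skipped.
print(extract_items('prefix <item1> <item2>  <item3> suffix'))
# ['item1', 'item2', 'item3']
```

Since findall scans for matches anywhere in the string, nothing needs to be stripped beforehand.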
CodePudding user response:
You can use this regular expression if your strings (item1, item2, etc.) don't contain a < or >
Regex:
<(.*?)>
Explanation:
< Match a literal <
( ) Capturing group
.*? Match any character (except a newline) as few times as possible (a lazy match)
> Match a literal >
To run this in Python:
import re
s = '<item1><item2><item3>'
re.findall(r'<(.*?)>', s)
This gives your expected output:
['item1', 'item2', 'item3']
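The \w+ and .*? patterns behave differently once an item contains anything other than letters, digits, or underscores. A small comparison on made-up input follows; the [^<>]+ variant is not from the answers above, just one possible alternative that also skips empty <> pairs.

```python
import re

# Made-up sample: an item with a space, an item with a hyphen, an empty pair.
s = '<item 1> <item-2><>'

# \w+ rejects the space and the hyphen, and needs at least one character.
word_only = re.findall(r'<(\w+)>', s)
print(word_only)    # []

# .*? accepts any characters, including none at all.
lazy_any = re.findall(r'<(.*?)>', s)
print(lazy_any)     # ['item 1', 'item-2', '']

# [^<>]+ accepts one or more characters that are not angle brackets.
no_brackets = re.findall(r'<([^<>]+)>', s)
print(no_brackets)  # ['item 1', 'item-2']
```

Which one fits depends on what the items can contain; for plain names like item1, all three return the same list.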