Let's say I have a string that looks like this:
one two three "four five"
I'd like to split such that I get an array:
['one', 'two', 'three', 'four five']
Using split with ' ' will not be enough here; I have to separate out the double-quoted part first. Is there a best-practice technique for this, or should I reinvent the wheel and do it myself?
CodePudding user response:
Use the following regex matching:
import re
s = 'one two three "four five"'
words = re.findall(r'(\w+|"[^"]+")', s)
print(words)
(a|b) - matches either what is before the | or what is after it
\w+ - matches any word
"[^"]+" - matches any sequence of characters other than " that is surrounded by quotes
['one', 'two', 'three', '"four five"']
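Note that the result above still carries the double quotes around the last item, while the question asks for 'four five' without them. A minimal sketch of one way to drop them, assuming a capture group around the quoted text only; the helper name split_quoted is hypothetical, and the sketch uses \S+ rather than \w+ for bare words, which is a deliberate change from the answer's pattern:

```python
import re

def split_quoted(text):
    """Split on whitespace, keeping double-quoted phrases together (hypothetical helper)."""
    tokens = []
    # Group 1 captures the inside of a quoted phrase; group 2 captures a bare word.
    for match in re.finditer(r'"([^"]*)"|(\S+)', text):
        quoted, bare = match.groups()
        # Exactly one of the two groups participates in each match.
        tokens.append(quoted if quoted is not None else bare)
    return tokens

print(split_quoted('one two three "four five"'))
```

The sketch does not handle escaped quotes inside a quoted phrase.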
CodePudding user response:
Taking "should I re-invent the wheel and do it myself?" as the primary requirement, we can reuse something meant for another purpose, the csv module:
import csv
import io
s = 'one two three "four five"'
# Treat the string as a one-line CSV file whose delimiter is a space
f = csv.reader(io.StringIO(s), dialect="excel", delimiter=" ")
for i in f:
    print(i)
['one', 'two', 'three', 'four five']
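For comparison, the standard library also has shlex.split, which splits a string using shell-like quoting rules. It is not mentioned in the answers above, so treat this as a sketch of an alternative rather than part of either answer; the variable names are only illustrative:

```python
import shlex

s = 'one two three "four five"'
# shlex.split treats double-quoted text as a single token and removes the quotes
words = shlex.split(s)
print(words)
```

Because it follows shell rules, it also interprets single quotes and backslashes, and it raises ValueError on an unbalanced quote, which may or may not be what you want for arbitrary input.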