Python: separating words using space, but preserving double quotes surrounded text as single unit

Time:02-03

Let's say I have a string that looks like this:

one two three "four five"

I'd like to split it so that I get a list:

['one', 'two', 'three', 'four five']

Using split with ' ' is not enough here; I would have to separate out the double-quoted text first. Is there a best-practice technique for this, or should I reinvent the wheel and do it myself?

CodePudding user response:

Use the following regex matching:

import re

s = 'one two three "four five"'
words = re.findall(r'(\w+|"[^"]+")', s)
print(words)
  • (a|b) - matches either what is before the | or what is after it
  • \w+ - matches one or more word characters (letters, digits, underscore)
  • "[^"]+" - matches a pair of double quotes enclosing one or more characters that are not "

['one', 'two', 'three', '"four five"']
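Note that this result still carries the double quotes around "four five", while the question asked for 'four five' without them. A minimal sketch of one way to drop them, assuming quotes only ever appear as a matched pair around a phrase (split_quoted is a hypothetical helper name, not part of the answer above):

```python
import re

def split_quoted(text):
    # Hypothetical helper: find bare words or double-quoted phrases,
    # then strip the surrounding quotes from each match.
    tokens = re.findall(r'\w+|"[^"]+"', text)
    return [token.strip('"') for token in tokens]

print(split_quoted('one two three "four five"'))
# ['one', 'two', 'three', 'four five']
```

Because \w+ only matches letters, digits and underscores, a word such as don't would be split in two; if that matters, swapping \w+ for [^\s"]+ is one possible adjustment.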

CodePudding user response:

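Taking "should I re-invent the wheel and do it myself?" as the primary requirement, we can reuse a module meant for something else: the csv reader, given a space as the delimiter, already treats double-quoted text as a single field.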

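import csv
import io

s = 'one two three "four five"'
# Read the string as a one-line, space-delimited CSV file;
# the excel dialect keeps double-quoted fields together.
reader = csv.reader(io.StringIO(s), dialect="excel", delimiter=" ")
for row in reader:
    print(row)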

['one', 'two', 'three', 'four five']
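Neither answer above mentions it, but the standard library's shlex module may be the closest thing to an existing wheel here: shlex.split tokenizes a string using shell-like rules, which keep double-quoted text together and remove the quotes. A short sketch, assuming shell-style handling of single quotes and backslashes is acceptable for the input:

```python
import shlex

s = 'one two three "four five"'
# shlex.split applies POSIX shell quoting rules by default,
# so the quoted phrase comes back as one item without its quotes.
words = shlex.split(s)
print(words)
# ['one', 'two', 'three', 'four five']
```

One caveat: an unbalanced quote in the input makes shlex.split raise ValueError, so untrusted text may need a try/except around the call.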