I am trying to split strings on commas while ignoring the commas inside double quotes. Then I need to add the resulting pieces to a list.
line = "DATA", "LT", "0.40", "1.25", "Sentence, which contain,
commas", "401", "", "MN", "", "", "", "", ""
When I try to do it with
lineItems = line.split(",")
It splits based on all commas.
Conversely, when I use a regex to split, I get everything back as a single list element (it does not split at all).
Is there any chance to get:
newlist = ['DATA', 'LT', '0.40', '1.25', 'Sentence, which contain,
commas', '401', '', 'MN', '', '', '', '', '']
Thanks!
P.S. I will have many similar rows, so I want to get the same kind of result for each of them by iterating.
CodePudding user response:
You could use the built-in shlex module, like so:
import shlex
line = '"DATA", "LT", "0.40", "1.25", "Sentence, which contain, commas", "401", "", "MN", "", "", "", "", ""'
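# shlex.split leaves the comma after each closing quote attached to the token,
# so drop one trailing comma; the last token has none and is left untouched.
newlist = [x[:-1] if x.endswith(",") else x for x in shlex.split(line)]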
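If slicing off the trailing comma feels fragile, a variation on the same shlex idea is to tell the lexer that commas are separators too. This is only a sketch: the helper name split_row is made up for illustration, and it assumes every field is wrapped in double quotes as in the question (an unquoted field containing spaces would be split on those spaces, and a backslash before a quote or another backslash inside quotes is treated as an escape).

```python
import shlex

def split_row(row):
    """Split a comma-separated row, keeping commas inside double quotes."""
    lexer = shlex.shlex(row, posix=True)
    lexer.whitespace += ","        # treat commas as separators, like spaces
    lexer.whitespace_split = True  # split only on separator characters
    lexer.commenters = ""          # do not treat '#' as a comment marker
    return list(lexer)

line = '"DATA", "LT", "0.40", "1.25", "Sentence, which contain, commas", "401", "", "MN", "", "", "", "", ""'
newlist = split_row(line)
print(newlist)
```

Because the empty fields are quoted, the lexer should keep them as empty strings instead of dropping them.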
CodePudding user response:
You mentioned you tried to split a 'string' variable. Therefore I assume you forgot to add the appropriate quotes around it. Is the following helpful, assuming the double quotes are balanced?
import re  # the standard-library module is enough for this pattern
line = """ "DATA", "LT", "0.40", "1.25", "Sentence, which contain,
commas", "401", "", "MN", "", "", "", "", "" """
# Capture the text between each pair of double quotes.
l = re.findall(r'"([^"]*)"', line)
print(l)
Prints:
['DATA', 'LT', '0.40', '1.25', 'Sentence, which contain, \ncommas', '401', '', 'MN', '', '', '', '', '']
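Since the question mentions many similar rows, the same pattern can be compiled once and applied to each row in a loop. A minimal sketch: the function name split_quoted and the sample rows are made up for illustration, and it still assumes balanced double quotes with no escaped quotes inside a field.

```python
import re

# Matches one double-quoted field and captures the text between the quotes.
# Assumes quotes are balanced and fields contain no escaped quotes.
QUOTED_FIELD = re.compile(r'"([^"]*)"')

def split_quoted(row):
    """Return the contents of every double-quoted field in row."""
    return QUOTED_FIELD.findall(row)

# Hypothetical sample rows, in the same shape as the question's data.
rows = [
    '"DATA", "LT", "0.40", "1.25", "Sentence, which contain, commas", "401", ""',
    '"DATA", "GT", "1.10", "2.50", "No commas here", "402", "MN"',
]

for row in rows:
    print(split_quoted(row))
```

Note that anything outside the quotes is ignored, so an unquoted field would be skipped silently rather than reported.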