I have this csv:
Type,ID,Value1,Value2,Name,Text
TypeA,1231,"value1,value2,value3","value7,value8, value9",name1,
TypeA,2123,,,name2,textA
TypeA,4242,,,name3,
TypeA,5135,,,name4,
TypeA,2123,,,name5,
TypeA,7525,,,name6,
TypeA,6869,value4,,name7,
TypeB,9654,"value5, value6",,name8,textB
TypeB,3225,,,name9,
TypeB,6545,,value10,name10,
how do I make it into a dictionary with some lists if there is more than one value? i've tried this:
with open(csv_file,'r') as f:
csv_list = [[val.strip() for val in r.split(",")] for r in f.readlines()]
(_, *header), *data = csv_list
print(csv_list)
csv_dict = {}
for row in data:
key, *values = row
if key not in csv_dict:
csv_dict[key] = []
csv_dict[key].append({key: value for key, value in zip(header, values)})
For example, I want csv_dict['TypeB'][0]
to print :
{'ID': '9654', 'Value1': ["value5, value6"], 'Value2': [], 'Name': 'name8', 'Text': 'textB'}
But it prints:
{'ID': '9654', 'Value1': '"value5', 'Value2': 'value6"', 'Name': '', 'Text': 'name8'}
CodePudding user response:
Try:
import csv
with open("your_file.csv", "r") as f_in:
reader = csv.DictReader(f_in)
data = list(reader)
for row in data:
row["Value1"] = [
ss for s in row["Value1"].split(",") if (ss := s.strip())
]
row["Value2"] = [
ss for s in row["Value2"].split(",") if (ss := s.strip())
]
print(data)
Prints:
[
{
"Type": "TypeA",
"ID": "1231",
"Value1": ["value1", "value2", "value3"],
"Value2": ["value7", "value8", "value9"],
"Name": "name1",
"Text": "",
},
{
"Type": "TypeA",
"ID": "2123",
"Value1": [],
"Value2": [],
"Name": "name2",
"Text": "textA",
},
{
"Type": "TypeA",
"ID": "4242",
"Value1": [],
"Value2": [],
"Name": "name3",
"Text": "",
},
...and so on.
CodePudding user response:
Read your file using csv.DictReader
instead of manually splitting lines on commas. csv.DictReader
takes care of commas escaped by quotes.
with open(csv_file, 'r') as f:
reader = csv.DictReader(f)
for data in reader:
print(data)
Which creates dicts for each row in your file, with the fields in quotes read as a single string, like so:
{'Type': 'TypeA', 'ID': '1231', 'Value1': 'value1,value2,value3', 'Value2': 'value7,value8, value9', 'Name': 'name1', 'Text': ''}
Now, since you want your Value1
item to be a list, you could split it by commas if the value contains a comma.
csv_dict = {}
with open(csv_file, 'r') as f:
reader = csv.DictReader(f)
for data in reader:
# Overwrite with split result if data["Value1"] is not an empty string
# Else, make an empty list
data["Value1"] = data["Value1"].split(",") if data["Value1"] else []
data["Value2"] = data["Value2"].split(",") if data["Value2"] else []
if data["Type"] not in csv_dict:
csv_dict[data["Type"]] = [data]
else:
csv_dict[data["Type"]].append(data)
Now, you have csv_dict["TypeB"][0]
is:
{'Type': 'TypeB',
'ID': '9654',
'Value1': ['value5', ' value6'],
'Value2': [],
'Name': 'name8',
'Text': 'textB'}
CodePudding user response:
Generally, if you use a simple r.split(",") it doesn't know anything about csv files (such as the fact that cells with multiple commas are marked with ""). Try using the csv library: https://docs.python.org/3/library/csv.html
Similar question: How do I read and write CSV files with Python?