I want to create a list of dictionaries from a CSV file without using any built-in functions. I need my function to return something like: [{col1: [values]}, {col2: [values]}, ...]
Here's my code snippet:
result = []
with open(filename, 'r') as f:
    headers = f.readline().strip().split(',')
    for line in f:
        line_values = line.strip().split(',')
        for i, header in enumerate(headers):
            row = {}
            row[header] = line_values[i]
            result.append(row)
But I get a result like: [{'col1': '16'}, {'col2': '3779'}, {'col3': '60.5'}, ....]
Answer:
I'm guessing that rather than a list of dictionaries, you want a single dictionary where each key is a column name and each value is a list of the values in that column. Here's how to get that:
with open('/tmp/data.csv', 'r') as f:
    # The first line holds the column names.
    headers = f.readline().strip().split(',')
    # Start with one empty list per column.
    result = {k: [] for k in headers}
    for line in f:
        line_values = line.strip().split(',')
        # Only use rows that have exactly one value per column.
        if len(line_values) == len(headers):
            for i, header in enumerate(headers):
                result[header].append(line_values[i])
print(result)
With that code and this input:
Col1,Col2,Col3
row1c1,row1c2,row1c3
row2c1,row2c2,row2c3
row3c1,row3c2,row3c3
You get:
{'Col1': ['row1c1', 'row2c1', 'row3c1'], 'Col2': ['row1c2', 'row2c2', 'row3c2'], 'Col3': ['row1c3', 'row2c3', 'row3c3']}
If you really want the format that you show in your example, you can convert the result to that as follows:
result = [{k: v} for k, v in result.items()]
Which gives you:
[{'Col1': ['row1c1', 'row2c1', 'row3c1']}, {'Col2': ['row1c2', 'row2c2', 'row3c2']}, {'Col3': ['row1c3', 'row2c3', 'row3c3']}]
The first result is more useful, as you can look up the values for a column directly via result[<column name>]. With the second version, you would need to iterate over the dictionaries in the list and look for the one whose key is the name of the column you want. In that case, the inner dictionaries aren't doing you any good; they just make lookup harder and less efficient.
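To make the difference concrete, here is a small sketch of both lookups using the sample data from above. The names by_column, as_list, col2_direct and col2_scanned are only illustrative and don't appear in the original code.

```python
# Sample data in both shapes (values taken from the example above).
by_column = {
    'Col1': ['row1c1', 'row2c1', 'row3c1'],
    'Col2': ['row1c2', 'row2c2', 'row3c2'],
    'Col3': ['row1c3', 'row2c3', 'row3c3'],
}
as_list = [{k: v} for k, v in by_column.items()]

# Single dictionary: one direct key lookup.
col2_direct = by_column['Col2']

# List of one-key dictionaries: scan until a dictionary with that key turns up.
col2_scanned = None
for d in as_list:
    if 'Col2' in d:
        col2_scanned = d['Col2']
        break

print(col2_direct == col2_scanned)
```

Both approaches end up with the same list, but the second one has to walk the outer list every time you want a column.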
NOTE: Even if you really do want the latter format, you would still compute the result the same way: build the single dictionary first, then convert it.
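Since the question asks for a function, here is a rough sketch that wraps the steps above and returns the list-of-dictionaries format. The function name read_columns is made up for illustration, and the temporary file only exists so the example runs on its own. It also assumes a simple CSV: splitting on ',' will not handle quoted fields that contain commas.

```python
import os
import tempfile


def read_columns(filename):
    """Hypothetical helper: return [{column_name: [values]}, ...] for a simple CSV."""
    with open(filename, 'r') as f:
        headers = f.readline().strip().split(',')
        columns = {k: [] for k in headers}
        for line in f:
            line_values = line.strip().split(',')
            # Skip rows whose field count doesn't match the header.
            if len(line_values) == len(headers):
                for i, header in enumerate(headers):
                    columns[header].append(line_values[i])
    # Convert the single dictionary into one dictionary per column.
    return [{k: v} for k, v in columns.items()]


# Demo with a throwaway file holding part of the sample input from above.
with tempfile.NamedTemporaryFile('w', suffix='.csv', delete=False) as tmp:
    tmp.write('Col1,Col2,Col3\nrow1c1,row1c2,row1c3\nrow2c1,row2c2,row2c3\n')
    path = tmp.name

result = read_columns(path)
os.remove(path)
print(result)
```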