I'm new to Python. I'm trying to read a CSV file into a dictionary, but each key shows up twice: once as the dictionary key and again as one of the columns in its value. How can I remove the key column from the values, since it is already used as the key?
Here is my code:
import csv

def read_dict(filename, key_column_index):
    """Read the contents of a CSV file into a compound
    dictionary and return the dictionary.
    Parameters
        filename: the name of the CSV file to read.
        key_column_index: the index of the column
            to use as the keys in the dictionary.
    Return: a compound dictionary that contains
        the contents of the CSV file.
    """
    # Create an empty dictionary that will store the data from the CSV file.
    csv_dict = {}
    with open(filename, "rt") as csv_file:
        # Use the csv module to create a reader object that will read from the opened CSV file.
        reader = csv.reader(csv_file)
        # Skip the first row of data as it contains the header of each column.
        next(reader)
        # Read the rows in the CSV file one row at a time.
        # The reader object returns each row as a list.
        for row_list in reader:
            # From the current row, retrieve the data from the column that contains the key.
            key = row_list[key_column_index]
            # Store the data from the current row into the dictionary.
            csv_dict[key] = row_list
    return csv_dict
CodePudding user response:
This skips the column at key_column_index by joining the list slices before and after it:
key = row_list[key_column_index]
csv_dict[key] = row_list[:key_column_index] + row_list[key_column_index + 1:]
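For context, here is how that slicing might fit into the whole function. This is only a minimal sketch built on the question's read_dict; the newline="" argument, the next(reader, None) default, and the blank-row check are additions of mine, not part of the original code.

```python
import csv


def read_dict(filename, key_column_index):
    """Read a CSV file into a dict mapping each key to the rest of its row."""
    csv_dict = {}
    # newline="" is what the csv module documentation recommends for open().
    with open(filename, "rt", newline="") as csv_file:
        reader = csv.reader(csv_file)
        # Skip the header row; the None default avoids an error on an empty file.
        next(reader, None)
        for row_list in reader:
            # Assumption: blank lines (returned as empty lists) should be ignored.
            if not row_list:
                continue
            key = row_list[key_column_index]
            # Keep every column except the one used as the key.
            csv_dict[key] = (
                row_list[:key_column_index] + row_list[key_column_index + 1:]
            )
    return csv_dict
```

Slicing builds a new list each time, so row_list itself is left unchanged. If two rows share the same key, the later row overwrites the earlier one, just as in the original code.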
CodePudding user response:
You can use csv.DictReader:
import csv
with open("samp.csv", "r", newline="") as f:
    d_reader = csv.DictReader(f)
    # Each row comes back as a dictionary keyed by column name.
    for row in d_reader:
        print(row)
DictReader yields one dictionary per row, keyed by column name, so the result resembles a list of JSON records.
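To tie this back to the question, the key column can be removed from each row dictionary with dict.pop. The following is a rough sketch under my own assumptions: read_dict_by_name and key_column_name are hypothetical names, and the key column is chosen by its header name rather than by index.

```python
import csv


def read_dict_by_name(filename, key_column_name):
    """Map each value of the key column to a dict of the remaining columns."""
    csv_dict = {}
    with open(filename, "rt", newline="") as csv_file:
        # DictReader takes the field names from the first row of the file.
        for row in csv.DictReader(csv_file):
            # pop() returns the key value and removes it from the row dict,
            # so the key column no longer appears among the values.
            key = row.pop(key_column_name)
            csv_dict[key] = row
    return csv_dict
```

Each value is then a dictionary keyed by column name instead of a list, which may or may not suit the rest of your program.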
Or, if you want each whole column as a list stored under its column name as the key, you can do this:
import csv
f = open("samp.csv", "r")
reader = csv.reader(f)
d = {}
for ri, r in enumerate(reader):
    if ri == 0:
        column_names = list(r)
    if ri > 0:
        for ci, c in enumerate(r):
            curr_cname = column_names[ci]
            if curr_cname not in d:
                d[curr_cname] = []
            d[curr_cname].append(c)
print(d)  # d = {'a': ['1', '5'], 'b': ['2', '6'], 'c': ['3', '7']}
f.close()
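The same column-wise result can probably be had with less bookkeeping by combining csv.DictReader with dict.setdefault. This is just a sketch; read_columns is a hypothetical name, and it assumes every row has the same number of fields as the header.

```python
import csv


def read_columns(filename):
    """Return a dict mapping each column name to the list of its values."""
    columns = {}
    with open(filename, "r", newline="") as f:
        for row in csv.DictReader(f):
            for name, value in row.items():
                # setdefault creates the list the first time a column is seen.
                columns.setdefault(name, []).append(value)
    return columns
```

As in the loop above, a file that contains only a header row produces an empty dictionary.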