I am currently doing the US Medical Insurance Cost Portfolio Project through Codecademy and am having trouble combining two lists into a single dictionary. I created two new lists (smoking_status and insurance_costs) in the hope of investigating how insurance costs differ between smokers and non-smokers. When I zip these two lists together, however, the new dictionary has only two entries. It should have well over a thousand. Where did I go wrong? Below are my code and output. It is worth noting that the output shown is the last two data points in my original csv file.
import csv

insurance_db = []
with open('insurance.csv', newline='') as insurance_csv:
    insurance_reader = csv.DictReader(insurance_csv)
    for row in insurance_reader:
        insurance_db.append(row)

smoking_status = []
for person in insurance_db:
    smoking_status.append(person.get('smoker'))

insurance_costs = []
for person in insurance_db:
    insurance_costs.append(person.get('charges'))

smoker_dictionary = {key: value for key, value in zip(smoking_status, insurance_costs)}
print(smoker_dictionary)
Output:
{'yes': '29141.3603', 'no': '2007.945'}
CodePudding user response:
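A key can’t appear twice in a dictionary, so each pair with an already existing key overwrites the previous value; that is why only the last “yes” and the last “no” entries survive. If I understand correctly, there are two statuses, “yes” and “no”. If so, the “correct” dictionary structure would be: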
{'yes':[...], 'no': [...]}
You can create a dictionary with an empty list for each status:
{status: [] for status in set(smoking_status)}
Then iterate over your zipped lists and append each cost to the list under the matching key.
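The approach just described could look roughly like the sketch below. The two short lists are made-up sample values standing in for the real data from insurance.csv, and the name costs_by_status and the conversion to float are assumptions added for illustration.

```python
# Made-up sample values standing in for the lists built from insurance.csv.
smoking_status = ["yes", "no", "no", "yes"]
insurance_costs = ["100.0", "20.0", "30.0", "200.0"]

# Start with one empty list per distinct status.
costs_by_status = {status: [] for status in set(smoking_status)}

# Append each cost under its status instead of overwriting the previous value.
for status, cost in zip(smoking_status, insurance_costs):
    # The csv module yields strings, so convert to float (an assumption about intended use).
    costs_by_status[status].append(float(cost))

print(costs_by_status)
```

Every cost is kept this way, so the total number of values across both lists matches the number of rows in the input.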
CodePudding user response:
Why loop over essentially the same input three times? A single pass is enough:
import csv

entries = []
with open('insurance.csv', newline='') as insurance_csv:
    for row in csv.DictReader(insurance_csv):
        entries.append([row["smoker"], row["charges"]])
We can't guess what your expected output should be; zipping two lists should create one list with the rows in each list joined up, so that's what this code does. If you want a dictionary instead, or a list of dictionaries, you'll probably want to combine the data from each input row differently, but the code should otherwise be more or less this.
To spell this out, if your input is
id,smoker,charges
1,no,123
2,no,567
3,yes,987
then the output will be (the csv module returns every field as a string)
[["no", "123"],
["no", "567"],
["yes", "987"]]
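If a dictionary is what you want instead, one possible single-pass variant is sketched below. It reads the three-row sample above from an in-memory string rather than from insurance.csv, and the names sample_csv, charges_by_smoker and averages are hypothetical; the per-group average is only a guess at what the comparison between smokers and non-smokers might need.

```python
import csv
import io
from collections import defaultdict

# The three-row sample from above, standing in for insurance.csv.
sample_csv = io.StringIO("id,smoker,charges\n1,no,123\n2,no,567\n3,yes,987\n")

# Missing keys start out as empty lists, so no setup step is needed.
charges_by_smoker = defaultdict(list)
for row in csv.DictReader(sample_csv):
    # csv yields strings, so convert charges to float before storing.
    charges_by_smoker[row["smoker"]].append(float(row["charges"]))

# Average charge per group.
averages = {status: sum(costs) / len(costs)
            for status, costs in charges_by_smoker.items()}
print(averages)
```

To run this against the real file, replace sample_csv with the file object from open('insurance.csv', newline='').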