how to group a file into a dictionary without importing

Time:09-10

I need to build a dictionary from a file that looks like this:

example = 

    'Computer science', random name, 17
    'Computer science', another name, 18
    'math', one name, 19

I want the majors to be the keys, but I'm having trouble grouping them. This is what I've tried:

dictionary = {}
for example in example_file:
    dictionary = {example[0]: {example[1]: example[2]}}

The problem is that this turns each line into its own dictionary, one by one, instead of collecting the lines that share a key into one dictionary. This is what it's returning:

{computer science: {random name: 17}}
{computer science: {another name: 18}}
{math: {one name: 19}}

This is how I want it to look:

{computer science: {random name: 17, another name: 18}, math: {one name: 19}}

How do I group these?

CodePudding user response:

You need to update the dictionary elements, not assign the whole dictionary each time through the loop.

You can use defaultdict(dict) to automatically create the nested dictionaries as needed.

from collections import defaultdict

dictionary = defaultdict(dict)

for subject, name, score in example_file:
    dictionary[subject][name] = int(score)
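
Since the question asks for a solution without imports, the same update-in-place idea can probably be sketched with a plain dict and a membership check. This is only a minimal sketch: the rows list is an assumed stand-in for example_file, taken to be already split into (major, name, age) fields.

```python
# Assumed stand-in for example_file: rows already split into three fields.
rows = [
    ("Computer science", "random name", "17"),
    ("Computer science", "another name", "18"),
    ("math", "one name", "19"),
]

dictionary = {}
for subject, name, score in rows:
    # Create the inner dict only the first time a major is seen.
    if subject not in dictionary:
        dictionary[subject] = {}
    # Update the existing inner dict instead of replacing the outer one.
    dictionary[subject][name] = int(score)

print(dictionary)
```

The membership check does by hand what defaultdict(dict) does automatically, so the two versions should build the same nested dictionary.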

CodePudding user response:

This is a well-known problem with an elegant solution that uses dict's setdefault() method, which returns the existing inner dictionary for a key or inserts a new empty one if the key is missing.

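dictionary = {}
for example in example_file:
    # Fetch the inner dict for this major, creating it on first sight.
    names = dictionary.setdefault(example[0], {})
    # Add this student's entry to the inner dict.
    names[example[1]] = example[2]

print(dictionary)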

This code prints:

{'Computer science': {'random name': 17, 'another name': 18}, 'math': {'one name': 19}}
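
The answers so far assume example_file already yields three separate fields per line. If the file is raw text like the sample in the question, one way to get there without imports might look like the sketch below. The function name group_by_major and the sample_lines list are hypothetical, and the simple comma split assumes no field contains a comma.

```python
def group_by_major(lines):
    """Group 'major, name, age' text lines into {major: {name: age}}."""
    grouped = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split on commas, then strip spaces and surrounding single quotes.
        major, name, age = [field.strip().strip("'") for field in line.split(",")]
        # setdefault returns the inner dict, creating it when the major is new.
        grouped.setdefault(major, {})[name] = int(age)
    return grouped


# Assumed sample lines, copied from the example file in the question.
sample_lines = [
    "'Computer science', random name, 17\n",
    "'Computer science', another name, 18\n",
    "'math', one name, 19\n",
]
print(group_by_major(sample_lines))
```

Because an open file object iterates over its lines, passing the file itself inside a with open(...) block should work the same way as passing the list.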

CodePudding user response:

An alternative using pandas (but @hhimko's solution is almost 50 times faster):

import pandas as pd

# Read the file without a header row, then sort by the major (column 0).
df = pd.read_csv("file.csv", header=None).sort_values(0).reset_index(drop=True)
result = dict()
major_holder = None
for index, row in df.iterrows():
    # Start a new inner dictionary whenever the major changes.
    if row.iloc[0] != major_holder:
        major_holder = row.iloc[0]
        result[major_holder] = dict()
    result[major_holder][row.iloc[1]] = row.iloc[2]
print(result)