Create dictionary from 2 columns of Dataframe

I have a dataframe:

import pandas as pd

df = pd.DataFrame({
    'ID': ['1', '4', '4', '3', '3', '3'],
    'club': ['arts', 'math', 'theatre', 'poetry', 'dance', 'cricket']
})

Note: Both columns of the data frame can contain repeated values.

I want to create a dictionary of dictionaries, one for every ID with its unique club names. It should look like this:

{
{'1':'arts'}, {'4':'math','theatre'}, {'3':'poetry','dance','cricket'}
}

Kindly help me with this.

CodePudding user response：

Try groupby() and then to_dict():

grouped = df.groupby("ID")["club"].apply(set)
print(grouped)
> ID
   1                      {arts}
   3    {cricket, poetry, dance}
   4             {math, theatre}

grouped_dict = grouped.to_dict()
print(grouped_dict)
> {'1': {'arts'}, '3': {'cricket', 'poetry', 'dance'}, '4': {'math', 'theatre'}}

Edit:

Changed to .apply(set) to get sets.
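
Sets are unordered and groupby() sorts the keys by default, so the result above does not keep the order from the data frame. If the order matters, something like the following might work. This is only a sketch: it assumes lists are acceptable as values, and the name `ordered` is just for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'ID': ['1', '4', '4', '3', '3', '3'],
    'club': ['arts', 'math', 'theatre', 'poetry', 'dance', 'cricket'],
})

# drop_duplicates() keeps the first occurrence of each (ID, club) row;
# sort=False keeps the groups in order of first appearance.
ordered = (
    df.drop_duplicates()
      .groupby('ID', sort=False)['club']
      .agg(list)
      .to_dict()
)
print(ordered)
# {'1': ['arts'], '4': ['math', 'theatre'], '3': ['poetry', 'dance', 'cricket']}
```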

CodePudding user response：

You can use a defaultdict:

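from collections import defaultdict

d = defaultdict(set)  # a missing key starts out as an empty set
for k, v in zip(df['ID'], df['club']):
    d[k].add(v)  # adding to a set ignores repeated clubs
dict(d)  # convert back to a plain dict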

output:

{'1': {'arts'}, '4': {'math', 'theatre'}, '3': {'cricket', 'dance', 'poetry'}}

Or, for a format closer to the one in the question (a list with one single-key dictionary per ID):

[{k: v} for k, v in d.items()]

output:

[{'1': {'arts'}},
 {'4': {'math', 'theatre'}},
 {'3': {'cricket', 'dance', 'poetry'}}]
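
If the clubs should also stay in the order they appear, one possible variant is to use inner dicts as ordered sets instead of real sets. This is a rough sketch, not tested against the asker's real data: the plain lists `ids` and `clubs` stand in for df['ID'] and df['club'], and the repeated ('4', 'math') pair at the end is made up to show the deduplication.

```python
from collections import defaultdict

# Hypothetical sample data; the last pair repeats ('4', 'math') on purpose.
ids = ['1', '4', '4', '3', '3', '3', '4']
clubs = ['arts', 'math', 'theatre', 'poetry', 'dance', 'cricket', 'math']

# Dict keys are unique and keep insertion order, so each inner dict
# behaves like an ordered set of club names.
seen = defaultdict(dict)
for k, v in zip(ids, clubs):
    seen[k][v] = None

result = {k: list(v) for k, v in seen.items()}
print(result)
# {'1': ['arts'], '4': ['math', 'theatre'], '3': ['poetry', 'dance', 'cricket']}
```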