I have a dataframe like this:
| col1 | col2 | col3 |
|------|------|------|
| US   | 1    | 1    |
| US   | 1    | 2    |
| US   | 2    | 1    |
| NL   | 1    | 1    |
| US   | 2    | 2    |
| DK   | 1    | 1    |
and I would like to get a dictionary of dictionaries of lists, grouped by col1, like this:

```python
{'US': {1: [1, 2], 2: [1, 2]}, 'NL': {1: [1]}, 'DK': {1: [1]}}
```

Basically, each unique value of col1 should map to a nested dictionary whose keys are the unique values of col2 and whose values are lists of the corresponding col3 entries.
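For reference, a frame matching the table above can be built like this:

```python
import pandas as pd

# The frame from the table above
df = pd.DataFrame({'col1': ['US', 'US', 'US', 'NL', 'US', 'DK'],
                   'col2': [1, 1, 2, 1, 2, 1],
                   'col3': [1, 2, 1, 1, 2, 1]})
```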
I tried

```python
dct = df.groupby("col1").apply(lambda x: x.set_index("col2")['col2'].to_dict()).to_dict()
```

but I do not get the expected outcome. Any suggestions?
CodePudding user response:
Here's one option using a nested `groupby`:
```python
out = df.groupby('col1').apply(lambda g: g.groupby('col2')['col3'].agg(list).to_dict()).to_dict()
```

Output:

```python
{'DK': {1: [1]}, 'NL': {1: [1]}, 'US': {1: [1, 2], 2: [1, 2]}}
```
CodePudding user response:
Using `itertuples`:

```python
d = {}
for a, b, c in df.itertuples(index=False, name=None):
    d.setdefault(a, {}).setdefault(b, []).append(c)
```

`d` is now:

```python
{'US': {1: [1, 2], 2: [1, 2]}, 'NL': {1: [1]}, 'DK': {1: [1]}}
```
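The same accumulation can be written with `collections.defaultdict`, which avoids the repeated `setdefault` calls (a sketch, assuming the same frame as in the question):

```python
from collections import defaultdict
import pandas as pd

# The frame from the question
df = pd.DataFrame({'col1': ['US', 'US', 'US', 'NL', 'US', 'DK'],
                   'col2': [1, 1, 2, 1, 2, 1],
                   'col3': [1, 2, 1, 1, 2, 1]})

# Nested defaultdict: a missing outer key yields an inner defaultdict of lists
d = defaultdict(lambda: defaultdict(list))
for a, b, c in df.itertuples(index=False, name=None):
    d[a][b].append(c)

# Convert back to plain dicts if downstream code expects them
result = {k: dict(v) for k, v in d.items()}
```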
Same thing, but using `map` and `zip`:

```python
d = {}
for a, b, c in zip(*map(df.get, ['col1', 'col2', 'col3'])):
    d.setdefault(a, {}).setdefault(b, []).append(c)
```

`d` is now:

```python
{'US': {1: [1, 2], 2: [1, 2]}, 'NL': {1: [1]}, 'DK': {1: [1]}}
```
Pandas variants — I don't think these are as good as the methods above.
1.

```python
d = df.groupby(['col1', 'col2'])['col3'].agg(list)
{a: d.xs(a).to_dict() for a in d.index.levels[0]}
```

```python
{'DK': {1: [1]}, 'NL': {1: [1]}, 'US': {1: [1, 2], 2: [1, 2]}}
```
2.

```python
{
    a: b.xs(a).to_dict()
    for a, b in df.groupby(['col1', 'col2'])['col3'].agg(list).groupby('col1')
}
```

```python
{'DK': {1: [1]}, 'NL': {1: [1]}, 'US': {1: [1, 2], 2: [1, 2]}}
```
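Another option in the same spirit: a dict comprehension over the outer `groupby`, which skips `apply` entirely (a sketch, assuming the same frame as in the question):

```python
import pandas as pd

# The frame from the question
df = pd.DataFrame({'col1': ['US', 'US', 'US', 'NL', 'US', 'DK'],
                   'col2': [1, 1, 2, 1, 2, 1],
                   'col3': [1, 2, 1, 1, 2, 1]})

# One inner groupby per outer group, collected into a plain dict
out = {k: g.groupby('col2')['col3'].agg(list).to_dict()
       for k, g in df.groupby('col1')}
```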