Home > Blockchain >  Convert Pandas DataFrame to dictionary where (grouped) columns are keys and (column-wise) rows are v
Convert Pandas DataFrame to dictionary where (grouped) columns are keys and (column-wise) rows are v

Time:11-15

I wish to convert a DataFrame into a dictionary where columns are the keys and (column-wise) rows are its values. I also need to use grouping when doing so.

        team  id    name  salary
0      Alpha  10    Jack    1000
1      Alpha  15    John    2000
2      Alpha  20    John    2000
3      Bravo  50  Thomas    5000
4      Bravo  55  Robert    6000
5      Bravo  60  Albert    7000

Expected output:

# expected output WITH duplicates
ex = {'Alpha': {'id': [10, 15, 20], 'name': ['Jack', 'John', 'John'], 'salary': [1000, 2000, 2000]},
      'Bravo': {'id': [50, 55, 60], 'name': ['Thomas', 'Robert', 'Albert'], 'salary': [5000, 6000, 7000]}
      }

# expected output WITHOUT duplicates ('name', 'salary')
ex = {'Alpha': {'id': [10, 15, 20], 'name': ['Jack', 'John'], 'salary': [1000, 2000]},
      'Bravo': {'id': [50, 55, 60], 'name': ['Thomas', 'Robert', 'Albert'], 'salary': [5000, 6000, 7000]}
      }

Can it be done somehow using df.to_dict() ?

Code for example:

import pandas as pd

d = {'team': ['Alpha', 'Alpha', 'Alpha', 'Bravo', 'Bravo', 'Bravo'],
     'id': [10, 15, 20, 50, 55, 60],
     'name': ['Jack', 'John', 'John', 'Thomas', 'Robert', 'Albert'],
     'salary': [1000, 2000, 2000, 5000, 6000, 7000]}

df = pd.DataFrame(data=d)

CodePudding user response:

A Groupby then to_dict should do the trick:

out = df.groupby('team').agg(list).to_dict('index')
print(out)

Output:

{'Alpha': {'id': [10, 15, 20],
           'name': ['Jack', 'John', 'John'],
           'salary': [1000, 2000, 2000]},
 'Bravo': {'id': [50, 55, 60],
           'name': ['Thomas', 'Robert', 'Albert'],
           'salary': [5000, 6000, 7000]}}

For unique lists:

out = df.groupby('team').agg(lambda x: x.unique().tolist()).to_dict('index')
print(out)

# Output:

{'Alpha': {'id': [10, 15, 20],
           'name': ['Jack', 'John'],
           'salary': [1000, 2000]},
 'Bravo': {'id': [50, 55, 60],
           'name': ['Thomas', 'Robert', 'Albert'],
           'salary': [5000, 6000, 7000]}}

CodePudding user response:

expected output WITH duplicates

df.groupby('team').agg(list).T.to_dict()

output:

{'Alpha': {'id': [10, 15, 20],
           'name': ['Jack', 'John', 'John'],
           'salary': [1000, 2000, 2000]},
 'Bravo': {'id': [50, 55, 60],
           'name': ['Thomas', 'Robert', 'Albert'],
           'salary': [5000, 6000, 7000]}}



expected output WITHOUT duplicates

df.groupby('team').agg(lambda x: list(set(x))).T.to_dict()

output:

{'Alpha': {'id': [10, 20, 15],
  'name': ['Jack', 'John'],
  'salary': [1000, 2000]},
 'Bravo': {'id': [50, 60, 55],
  'name': ['Thomas', 'Albert', 'Robert'],
  'salary': [5000, 7000, 6000]}}
  • Related