How to create a dictionary from Excel with list as values in pandas?-CodePudding

INPUT- This is an excel sheet

| DatasetName |      Fields |
| ----------- |      ------ |
| Table_1     |     Column1 |
| Table_1     |     Column2 |
| Table_1     |     Column3 |
| Table_1     |     Column4 |
| Table_2     |     Column1 |
| Table_2     |     Column2 |
| Table_2     |     Column3 |
| Table_2     |     Column4 |
| Table_3     |     Column1 |
| Table_3     |     Column2 |
| Table_3     |     Column3 |
| Table_3     |     Column4 |

OUTPUT-

{'Table_1' : ['Column1','Column2','Column3','Column4'] ,
 'Table_2' : ['Column1','Column2','Column3','Column4'] ,
 'Table_3' : ['Column1','Column2','Column3','Column4'] }

CodePudding user response：

Try:

import pandas as pd

df = pd.read_csv('my_file.csv')
# Get the unique dataset names 
names = df['DatasetName'].unique()

result = {}
# Loop over the names, get the corresponding values, store them as a list
for name in names:
    x = df.loc[df['DatasetName'] == name]
    vals = x['Fields'].tolist()
    result[name] = vals

print(result)

CodePudding user response：

Leverage on the powerful Pandas features on data grouping and aggregation:

You can do it easily with just one line with .groupby() .agg() to_dict(), as follows:

import pandas as pd

# read the Excel into dataframe `df`
df = pd.read_excel('my_excel.xlsx')

# generate dict 
dict_out = df.groupby('DatasetName')['Fields'].agg(list).to_dict()

Result:

print(dict_out)

{'Table_1 ': ['Column1', 'Column2', 'Column3', 'Column4'],
 'Table_2': ['Column1', 'Column2', 'Column3', 'Column4'],
 'Table_3': ['Column1', 'Column2', 'Column3', 'Column4']}