What can I do to convert a nested dictionary like this example:
nested_dict = { 'Girl': {'June': [45, 32], 'Samantha': [14, 34, 65]},
'Boy': {'Brad': [12, 54, 12], 'Chad': [12]}}
to a dataframe of this format?:
Col1 Col2 Col3
June Girl 45
June Girl 32
Samantha Girl 14
Samantha Girl 34
Samantha Girl 65
Brad Boy 12
Brad Boy 54
Brad Boy 12
Chad Boy 12
failed attempt:
df = pd.DataFrame.from_dict(nested_dict, orient='index')
df
Out[17]:
June Samantha Brad Chad
Girls [45, 32] [14, 34, 65] NaN NaN
Boys NaN NaN [12, 54, 12] [12]
CodePudding user response:
You can use stack
and explode
:
import pandas as pd
nested_dict = { 'Girl': {'June': [45, 32], 'Samantha': [14, 34, 65]},
'Boy': {'Brad': [12, 54, 12], 'Chad': [12]}}
df = pd.DataFrame.from_dict(nested_dict, orient='index')
print(df.stack().explode())
Output:
Girl June 45
June 32
Samantha 14
Samantha 34
Samantha 65
Boy Brad 12
Brad 54
Brad 12
Chad 12
CodePudding user response:
You can't use the option orient="index"
because that converts each dictionary key into a single row, but your original dictionary is more complexly structured.
You have to manually reprocess the dictionary before converting it because the original data you have are normalized - they is nested according to shared data values. The DataFrame object you want to create is explicitly NOT normalized.
Here is some code to de-normalize your data so it can be converted:
nested_dict = {
'Girl': { 'June': [45, 32], 'Samantha': [14, 34, 65] },
'Boy': { 'Brad': [12, 54, 12], 'Chad': [12] }
}
metaDF = { 'Col1' : [], 'Col2' : [], 'Col3' : [] }
for gender in nested_dict:
for person in nested_dict[gender]:
for age in nested_dict[gender][person]:
metaDF['Col1'].append( person )
metaDF['Col2'].append( gender )
metaDF['Col3'].append( age )
df = pd.DataFrame.from_dict( metaDF, orient="columns" )
Reference: