I can't figure out how to create columns from the first item ('code') of each dictionary in a column of nested lists, and then fill each new column with the second item ('value') of that dictionary.
Reproducible example
import pandas as pd
df = pd.DataFrame({'ID':[1,2,3,4], 'nestedlist':[[{'code':1,'value':10},{'code':2,'value':20}],
[{'code':2,'value':10},{'code':3,'value':20}],
[{'code':1,'value':10},{'code':2,'value':20}],
[{'code':1,'value':10},{'code':3,'value':20}]]})
I want to achieve something like this:
# desired DataFrame
ID code1 code2 code3
0 1 10 20 NaN
1 2 NaN 10 20
2 3 10 20 NaN
3 4 10 NaN 20
CodePudding user response:
You can explode the nestedlist column, convert the resulting dict column into multiple columns with pd.Series, and then pivot the table.
df_ = df.explode('nestedlist')
df_ = pd.concat([df_['ID'], df_['nestedlist'].apply(pd.Series)], axis=1)
df_ = df_.pivot(index='ID', columns='code', values='value').add_prefix('code').reset_index()
print(df_)
code ID code1 code2 code3
0 1 10.0 20.0 NaN
1 2 NaN 10.0 20.0
2 3 10.0 20.0 NaN
3 4 10.0 NaN 20.0
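The values come out as floats (10.0) because the missing cells are NaN, while the desired output in the question shows integers. If that matters, one possible variation is sketched below. It assumes all values are whole numbers, uses pandas' nullable 'Int64' dtype, and builds the code/value columns with pd.DataFrame(...tolist()) instead of apply(pd.Series). The names long, parts and wide are just illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'ID': [1, 2, 3, 4],
    'nestedlist': [
        [{'code': 1, 'value': 10}, {'code': 2, 'value': 20}],
        [{'code': 2, 'value': 10}, {'code': 3, 'value': 20}],
        [{'code': 1, 'value': 10}, {'code': 2, 'value': 20}],
        [{'code': 1, 'value': 10}, {'code': 3, 'value': 20}],
    ],
})

# One row per dict, with the ID repeated next to it.
long = df.explode('nestedlist').reset_index(drop=True)

# Expand each dict into separate 'code' and 'value' columns.
parts = pd.DataFrame(long['nestedlist'].tolist())
long = pd.concat([long[['ID']], parts], axis=1)

wide = (long.pivot(index='ID', columns='code', values='value')
            .add_prefix('code')
            .astype('Int64')            # nullable integers; gaps become <NA>
            .rename_axis(columns=None)  # drop the leftover 'code' columns label
            .reset_index())
print(wide)
```

The rename_axis call only removes the "code" label that otherwise shows up in the header row of the printed frame.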
CodePudding user response:
You could explode the column and pivot it, then join it back to df:
exploded = df['nestedlist'].explode()
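# pd.pivot takes index/columns/values as keyword arguments (keyword-only in pandas 2.x)
out = (df[['ID']].join(pd.pivot(pd.DataFrame(exploded.tolist(), index=exploded.index)
                                .reset_index(),
                                index='index', columns='code', values='value')
                       .add_prefix('code')))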
Output:
ID code1 code2 code3
0 1 10.0 20.0 NaN
1 2 NaN 10.0 20.0
2 3 10.0 20.0 NaN
3 4 10.0 NaN 20.0
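For comparison, here is a different route that is not taken from either answer above: skip explode and pivot, and build one {'code<k>': value} dict per row with a comprehension. This is only a sketch, assuming every dict has exactly the keys 'code' and 'value' and that a code does not repeat within a row. The names rows, wide and out are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'ID': [1, 2, 3, 4],
    'nestedlist': [
        [{'code': 1, 'value': 10}, {'code': 2, 'value': 20}],
        [{'code': 2, 'value': 10}, {'code': 3, 'value': 20}],
        [{'code': 1, 'value': 10}, {'code': 2, 'value': 20}],
        [{'code': 1, 'value': 10}, {'code': 3, 'value': 20}],
    ],
})

# Turn each list of dicts into a single mapping such as {'code1': 10, 'code2': 20}.
rows = [{f"code{d['code']}": d['value'] for d in items}
        for items in df['nestedlist']]

# Codes missing from a row become NaN when the frame is built.
wide = pd.DataFrame(rows, index=df.index)

# Plain string sort of the column names; fine for single-digit codes like these.
wide = wide[sorted(wide.columns)]

out = df[['ID']].join(wide)
print(out)
```

As with the pivot-based answers, the columns are float because of the NaN gaps.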