Split nested list with dictionary into columns based on key value

Time:04-13

I can't figure out how to create columns from the first item ('code') of each dictionary in a nested-list column, and then fill each of those columns with the dictionary's second item ('value').

Reproducible example

import pandas as pd
df = pd.DataFrame({'ID':[1,2,3,4], 'nestedlist':[[{'code':1,'value':10},{'code':2,'value':20}],
                                            [{'code':2,'value':10},{'code':3,'value':20}],
                                            [{'code':1,'value':10},{'code':2,'value':20}],
                                            [{'code':1,'value':10},{'code':3,'value':20}]]})

I want to achieve something like this:

# desired DataFrame
   ID code1 code2 code3
0  1  10    20    NaN
1  2  NaN   10    20
2  3  10    20    NaN
3  4  10    NaN   20

CodePudding user response:

You can explode the nestedlist column so that each dictionary gets its own row, expand the dictionaries into separate 'code' and 'value' columns with pd.Series, and then pivot so that each code becomes a column.

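# One row per dictionary; ID is repeated for every element of the list
df_ = df.explode('nestedlist')
# Expand each dictionary into 'code' and 'value' columns next to ID
df_ = pd.concat([df_['ID'], df_['nestedlist'].apply(pd.Series)], axis=1)
# Turn each code into its own column, prefixed with 'code'
df_ = df_.pivot(index='ID', columns='code', values='value').add_prefix('code').reset_index()
print(df_)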

code  ID  code1  code2  code3
0      1   10.0   20.0    NaN
1      2    NaN   10.0   20.0
2      3   10.0   20.0    NaN
3      4   10.0    NaN   20.0
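
On larger frames, apply(pd.Series) tends to be slow. As a rough sketch of the same explode-then-pivot idea, the long table can also be built with a plain comprehension. The names records, long_df and wide are only illustrative, and rename_axis is added here to drop the 'code' label from the column header, which the code above does not do.

```python
import pandas as pd

df = pd.DataFrame({
    'ID': [1, 2, 3, 4],
    'nestedlist': [
        [{'code': 1, 'value': 10}, {'code': 2, 'value': 20}],
        [{'code': 2, 'value': 10}, {'code': 3, 'value': 20}],
        [{'code': 1, 'value': 10}, {'code': 2, 'value': 20}],
        [{'code': 1, 'value': 10}, {'code': 3, 'value': 20}],
    ],
})

# Build one flat record per dictionary, carrying the parent ID along.
records = [
    {'ID': row_id, **item}
    for row_id, items in zip(df['ID'], df['nestedlist'])
    for item in items
]
long_df = pd.DataFrame(records)  # columns: ID, code, value

# Pivot to wide form; rename_axis drops the 'code' label from the header.
wide = (long_df.pivot(index='ID', columns='code', values='value')
               .add_prefix('code')
               .rename_axis(columns=None)
               .reset_index())
print(wide)
```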

CodePudding user response:

You could explode the column and pivot it; then join it back to df:

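# One row per dictionary; the exploded Series keeps the original row labels as its index
exploded = df['nestedlist'].explode()
# Expand the dictionaries into 'code'/'value' columns, pivot on the original row label,
# then join the wide result back to the ID column
# (pd.pivot needs keyword arguments in pandas 2.0+)
out = (df[['ID']].join(pd.pivot(pd.DataFrame(exploded.tolist(), index=exploded.index)
                                .reset_index(), index='index', columns='code', values='value')
                       .add_prefix('code')))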

Output:

   ID  code1  code2  code3
0   1   10.0   20.0    NaN
1   2    NaN   10.0   20.0
2   3   10.0   20.0    NaN
3   4   10.0    NaN   20.0
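
One caveat: pivot raises a ValueError if the same code appears twice for one ID. The question's data never does this, so the following is only a sketch on made-up data (the frame dup and the choice of 'sum' as the aggregation are assumptions for illustration). It shows how pivot_table could be swapped in for that case.

```python
import pandas as pd

# Hypothetical long-form data: ID 1 carries code 1 twice.
dup = pd.DataFrame({
    'ID': [1, 1, 1, 2],
    'code': [1, 1, 2, 3],
    'value': [10, 30, 20, 20],
})

# pivot() would raise ValueError here because (ID=1, code=1) is not unique.
# pivot_table() aggregates the duplicates instead; 'sum' is an arbitrary choice.
wide_dup = (dup.pivot_table(index='ID', columns='code', values='value', aggfunc='sum')
               .add_prefix('code')
               .rename_axis(columns=None)
               .reset_index())
print(wide_dup)
```

Whether to sum, take the first value, or treat duplicates as an error depends on what the codes mean in your data.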