Home > other >  How to unpack dictionaries in rows pandas dataframe
How to unpack dictionaries in rows pandas dataframe

Time:06-03

I have a dataframe in pandas that looks something like this:

id   column_name_1                         column_name_2                          column_name_3        
1    {acct_nr_1: value, acct_nr_2:value}   {acct_nr_1: value, acct_nr_2:value}    {acct_nr_1: value, acct_nr_2:value}    
2    {acct_nr_1: value, acct_nr_2:value}   {acct_nr_1: value, acct_nr_2:value}    {acct_nr_1: value, acct_nr_2:value}       
3    {acct_nr_1: value, acct_nr_2:value}   {acct_nr_1: value, acct_nr_2:value}    {acct_nr_1: value, acct_nr_2:value}       
4    {acct_nr_1: value, acct_nr_2:value}   {acct_nr_1: value, acct_nr_2:value}    {acct_nr_1: value, acct_nr_2:value} 

Some dictionaries have more than two account numbers, and the values differ. What I want to do is get a dataframe with the same columns plus one. The new column should have the account number (if I only have in the dictionary) and the same if there is more than one account but broken down into several rows. And in each column I want to get only the values.

So the new dataframe should be as follows:

id   account_number        column_name_1   column_name_2   column_name_3        
1    acct_nr_1             value           value           value
1    acct_nr_2             value           value           value
2    acct_nr_1             value           value           value          
2    acct_nr_2             value           value           value
3    acct_nr_1             value           value           value    
3    acct_nr_2             value           value           value
4    acct_nr_1             value           value           value
4    acct_nr_2             value           value           value

I couldn't find a simple (or any) way to solve it.

Thanks!

CodePudding user response:

Stack the df to convert to mutliindex series with two levels, then create a new dataframe from the dictionaries, stack the new df to convert to multindex series with three levels, now unstack on level 1 to reshape back into desired format

s = df.set_index('id').stack()

(pd.DataFrame([*s], index=s.index)
  .stack()
  .unstack(1)
  .rename_axis(['id', 'account_number'])
  .reset_index())

   id account_number column_name_1 column_name_2 column_name_3
0   1      acct_nr_1         value         value         value
1   1      acct_nr_2         value         value         value
2   2      acct_nr_1         value         value         value
3   2      acct_nr_2         value         value         value
4   3      acct_nr_1         value         value         value
5   3      acct_nr_2         value         value         value
6   4      acct_nr_1         value         value         value
7   4      acct_nr_2         value         value         value
  • Related