I have a dataframe in pandas that looks something like this:
id column_name_1 column_name_2 column_name_3
1 {acct_nr_1: value, acct_nr_2:value} {acct_nr_1: value, acct_nr_2:value} {acct_nr_1: value, acct_nr_2:value}
2 {acct_nr_1: value, acct_nr_2:value} {acct_nr_1: value, acct_nr_2:value} {acct_nr_1: value, acct_nr_2:value}
3 {acct_nr_1: value, acct_nr_2:value} {acct_nr_1: value, acct_nr_2:value} {acct_nr_1: value, acct_nr_2:value}
4 {acct_nr_1: value, acct_nr_2:value} {acct_nr_1: value, acct_nr_2:value} {acct_nr_1: value, acct_nr_2:value}
Some dictionaries have more than two account numbers, and the values differ. What I want to do is get a dataframe with the same columns plus one. The new column should have the account number (if I only have in the dictionary) and the same if there is more than one account but broken down into several rows. And in each column I want to get only the values.
So the new dataframe should be as follows:
id account_number column_name_1 column_name_2 column_name_3
1 acct_nr_1 value value value
1 acct_nr_2 value value value
2 acct_nr_1 value value value
2 acct_nr_2 value value value
3 acct_nr_1 value value value
3 acct_nr_2 value value value
4 acct_nr_1 value value value
4 acct_nr_2 value value value
I couldn't find a simple (or any) way to solve it.
Thanks!
CodePudding user response:
Stack the df to convert to mutliindex series with two levels, then create a new dataframe from the dictionaries, stack
the new df to convert to multindex series with three levels, now unstack
on level 1
to reshape back into desired format
s = df.set_index('id').stack()
(pd.DataFrame([*s], index=s.index)
.stack()
.unstack(1)
.rename_axis(['id', 'account_number'])
.reset_index())
id account_number column_name_1 column_name_2 column_name_3
0 1 acct_nr_1 value value value
1 1 acct_nr_2 value value value
2 2 acct_nr_1 value value value
3 2 acct_nr_2 value value value
4 3 acct_nr_1 value value value
5 3 acct_nr_2 value value value
6 4 acct_nr_1 value value value
7 4 acct_nr_2 value value value