While grouping two columns into a new dataframe , how do I create another column in the new datafram-CodePudding

I am currently using the following code to combine Col A and Col B to get Col A & B in DataFrame B below:

out = (df
   .groupby('Col A', group_keys=False, sort=False)
   .apply(lambda d: d.iloc[:, ::-1].unstack().drop_duplicates())
   .reset_index(drop=True).to_frame(name='Col A&B')
)

My question is: how can I create the Col C in DataFrame B that uses the column headers from DataFrame A to label where each value in Col A & B came from?

DataFrame A

Col A	Col B	Col C
1000	100	10
1000	100	20
2000	200	30
2000	200	40

DataFrame B

Col A & B	Col C
10	Col C
20	Col C
100	Col B
1000	Col A
30	Col C
40	Col C
200	Col B
2000	Col A

CodePudding user response：

You might want to use stack drop_duplicates instead of groupby:

df.stack().reset_index(level=1).drop_duplicates().rename(columns={0: 'Col A & B', 'level_1': 'Col C'})

   Col C  Col A & B
0  Col A        100
0  Col B         10
1  Col B         20
2  Col A        200
2  Col B         30
3  Col B         40

CodePudding user response：

You can use groupby and then use stack and sort_values in each group.

df_new = df.groupby('Col A'
                   ).apply(lambda g : g.stack().reset_index().sort_values([0],
                                                                          ascending=True)
                          ).reset_index(drop=True
                                       ).drop('level_0', 
                                              axis=1).rename(columns={0: 'Col A & B', 
                                                                      'level_1': 'Col C'}
                                                            ).drop_duplicates(subset=['Col C', 
                                                                                      'Col A & B'])

print(df_new)

Output:

    Col C  Col A & B
0   Col C         10
1   Col C         20
2   Col B        100
4   Col A       1000
6   Col C         30
7   Col C         40
8   Col B        200
10  Col A       2000