How do I perform the following dataframe operation going from Dataframe A to dataframe B in pandas for python? I have tried pivot and groupby but I keep getting errors. Any support is greatly appreciated.
DataFrame A
Col A | Col B |
---|---|
100 | 1 |
100 | 2 |
200 | 3 |
200 | 4 |
DataFrame B
Col A & B |
---|
1 |
2 |
100 |
3 |
4 |
200 |
CodePudding user response:
One option using groupby
:
out = (df
.groupby('Col A', group_keys=False, sort=False)
.apply(lambda d: d.iloc[:, ::-1].unstack().drop_duplicates())
.reset_index(drop=True).to_frame(name='Col A&B')
)
Another with concat
:
out = (pd
.concat([df['Col B'], df['Col A'].drop_duplicates(keep='last')])
.sort_index().reset_index(drop=True).to_frame(name='Col A&B')
)
output:
Col A&B
0 1
1 2
2 100
3 3
4 4
5 200
If order does not matter, you can stack
:
out = df.stack().drop_duplicates().reset_index(drop=True).to_frame(name='Col A&B')
output:
Col A&B
0 100
1 1
2 2
3 200
4 3
5 4
CodePudding user response:
Another possible solution:
out = pd.DataFrame({'Col A&B': np.unique(df)})
out
Output:
Col A&B
0 1
1 2
2 3
3 4
4 100
5 200