I have a daframe as follows:
Month Col1 Col2 Val
A p a1 31
A q a1 78
A r b2 13
B x a1 54
B y b2 56
B z b2 65
I want to get the following:
Month a1 b2
A q r
B x z
Essentially for each pair of Month
and Col2
, I want to find the value in Col1
which is has the maximum value.
I am not sure how to approach this.
CodePudding user response:
Your problem is:
- Find row with max
Val
within a group, which issort
anddrop_duplicates
, and - transform the data, which is
pivot
:
(df.sort_values('Val')
.drop_duplicates(['Month','Col2'], keep='last')
.pivot(index='Month', columns='Col2', values='Col1')
)
Output:
Col2 a1 b2
Month
A q r
B x z