Home > Enterprise >  Finding max row after groupby in pandas dataframe
Finding max row after groupby in pandas dataframe

Time:11-10

I have a daframe as follows:

Month   Col1    Col2    Val
A       p       a1      31
A       q       a1      78
A       r       b2      13
B       x       a1      54
B       y       b2      56
B       z       b2      65

I want to get the following:

Month   a1  b2
A       q   r
B       x   z

Essentially for each pair of Month and Col2, I want to find the value in Col1 which is has the maximum value.

I am not sure how to approach this.

CodePudding user response:

Your problem is:

  1. Find row with max Val within a group, which is sort and drop_duplicates, and
  2. transform the data, which is pivot:
(df.sort_values('Val') 
   .drop_duplicates(['Month','Col2'], keep='last')
   .pivot(index='Month', columns='Col2', values='Col1')
)

Output:

Col2  a1 b2
Month      
A      q  r
B      x  z
  • Related