Home > front end >  Grouping by a values of a column and return max
Grouping by a values of a column and return max

Time:07-24

I have a pandas dataframe with many columns. Now I want to group by values of a column and then return max of another specified column, and also return the values of the remaining columns corresponding to the rows where that max occurred. Below is one example

import pandas as pd
dat = pd.DataFrame({'A' : ['Y', 'Y', 'Z'] , 'X1' : [10, 20, 30], 'X2' : [11, 10, 20], 'X3' : [20, 15, 60]})
dat.groupby('A').agg({'X1' : 'max'})
   X1
A    
Y  20
Z  30

As you see it did not return other columns i.e. X2, X3 etc. I want to get values of those column where the max occurred i.e. {10, 15} and {20, 60}. In column X1, there are no repeated values within a group. In my actual dataframe I have many columns so a general approach would be very helpful

CodePudding user response:

idxmax returns the index where the max appears, so we can make use of that by taking the whole row where X1's max is in each group.

df.loc[df.groupby('A')['X1'].idxmax()]

Output:

   A  X1  X2  X3
1  Y  20  10  15
2  Z  30  20  60
  • Related