Home > Software design >  Pandas - take maximum/minimum of columns within group
Pandas - take maximum/minimum of columns within group

Time:09-22

I have a Pandas dataframe df with column names par1,par2,..,par7. I want to select a subset of the total dataframe. Within the group par1,par2,par3 I want to select the rows for which par4 has the lowest value and par5 has the highest value in that group. So from the dataframe:

Par1,Par2,Par3,Par4,Par5,Par6,Par7
8,7,0,5,1.5,16.66,20.55
8,7,0,10,1.5,21.64,26.32
8,7,0,15,1.5,26.62,32.58
8,7,0,20,1.5,31.62,37.17
8,7,0,5,3,16.66,27.67
8,7,0,10,3,21.64,36.39
8,7,0,15,3,26.62,46.95
8,7,0,20,3,31.62,54.05

I want to return:

Par1,Par2,Par3,Par4,Par5,Par6,Par7
8,7,0,5,3,16.66,27.67

Because here par3 is 5.0 (lower than 10,15,20) and par4 is 3 (higher than 1.5).

CodePudding user response:

Use DataFrame.sort_values with DataFrame.drop_duplicates:

df = (df.sort_values(['Par1','Par2','Par3', 'Par4', 'Par5'], 
                      ascending=[True, True, True, True, False])
       .drop_duplicates(['Par1','Par2','Par3']))
print (df)
   Par1  Par2  Par3  Par4  Par5   Par6   Par7
4     8     7     0     5   3.0  16.66  27.67
  • Related