I have a Spark DataFrame that looks like this:
df =
Name Score Section
W 26 A
M 62 A
Q 69 A
Y 86 A
J 16 B
A 83 B
I want to create a new DataFrame that contains a single row (the row with the max score), so it will look like this:
dataframe_maximum =
Name Score Section
Y 86 A
I know I can use groupBy and agg with max to achieve this. I tried something like this, but I don't think I have it quite right:
dataframe_max = df.groupBy(['Name','Score','Section']).agg(
    max('Score')
)
CodePudding user response:
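df.sort("Score", ascending=False).take(1)
Note that a sort is a wide operation, so it might not be efficient on a large DataFrame. Also, take(1) returns a list of Row objects rather than a DataFrame; use limit(1) instead if you need a DataFrame.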