Get the top 2 values for each unique value in another column-CodePudding

I have a DataFrame like this:

student       marks     term
steve         55        1
jordan        66        2
steve         53        1
alan          74        2
jordan        99        1
steve         81        2
alan          78        1
alan          76        2
jordan        48        1

I would like to return highest two scores for each student

student  marks    term
steve      81     2
steve      55     1
jordan     99     1
jordan     66     2
alan       78     1
alan       76     2

I have tried

df = df.groupby('student')['marks'].max()

but it returns 1 row, I would like each student in the order they are mentioned with top two scores.

CodePudding user response：

You could use groupby nlargest to find the 2 largest values; then use loc to sort in the order they appear in df:

out = (df.groupby('student')['marks'].nlargest(2)
       .droplevel(1)
       .loc[df['student'].drop_duplicates()]
       .reset_index())

Output:

  student  marks
0   steve     81
1   steve     55
2  jordan     99
3  jordan     66
4    alan     78
5    alan     76

If you want to keep "terms" as well, you could use the index:

idx = df.groupby('student')['marks'].nlargest(2).index.get_level_values(1)
out = df.loc[idx].set_index('student').loc[df['student'].drop_duplicates()].reset_index()

Output:

  student  marks  term
0   steve     81     2
1   steve     55     1
2  jordan     99     1
3  jordan     66     2
4    alan     78     1
5    alan     76     2

@sammywemmy suggested a better way to derive the second result:

out = (df.loc[df.groupby('student', sort=False)['marks'].nlargest(2)
              .index.get_level_values(1)]
       .reset_index(drop=True))

CodePudding user response：

Sorting before grouping should suffice, since you need to keep the term column:

df.sort_values('marks').groupby('student', sort = False).tail(2)
  student  marks  term
0   steve     55     1
1  jordan     66     2
7    alan     76     2
6    alan     78     1
5   steve     81     2
4  jordan     99     1