Home > Back-end >  Grouping by ID choosing highest values in columns from same ID
Grouping by ID choosing highest values in columns from same ID

Time:11-24

I have a problem trying to calculate some final tests marks. I need to group by Students, getting only the highest value in each column for each student.

Being DF the dataframe:

data = {'Students': ['Student1', 'Student1', 'Student1', 'Student2','Student2','Studen3'], 
        'Result1': [2, 4, 5, 8, 2, 5],
        'Result2': [5, 3, 2, 8, 5, 5],
        'Result3': [7, 5, 7, 3, 8, 9]}
df = pd.DataFrame(data)

    Students    Result1     Result2     Result3
0   Student1    2   5   7
1   Student1    4   3   5
2   Student1    5   2   7
3   Student2    8   8   3
4   Student2    2   5   8
5   Studen3     5   5   9

I need to generate a DF choosing the higher mark, for each student, in each Result.

So, the final DF should look like:

    Students    Result1     Result2     Result3
0   Student1    5   5   7
1   Student2    8   8   8
2   Student3    5   5   9

Any help?

CodePudding user response:

The dataframe can be generated using simply iterations over groups:

df2 = pd.DataFrame(columns=('Student', 'res1', 'res2', 'res3'))

for s in df.Students.unique():
    stdf = df[df["Students"]==s]
    df2 = df2.append({'Student':s,'res1':max(stdf.Result1),'res2':max(stdf.Result2),
                      'res3':max(stdf.Result3)}, ignore_index=True)

CodePudding user response:

 Works calling groupby('Students').max()

>>> df.groupby('Students').max()

           Result1  Result2  Result3
Students
Student1        5        5        7
Student2        8        8        8
Student3        5        5        9
  • Related