Beginner here. I have a dataframe like this:
My question is: how can I get the latest overall score for each student_id? I tried groupby, but I did not get the desired results. Please help me, thank you.
CodePudding user response:
If df is your dataframe and its rows are already sorted by date (otherwise call sort_values('date') first):
df.drop_duplicates(subset='student_id', keep="last")
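Since keep="last" only keeps the last row in the current row order, here is a minimal sketch that sorts first. It assumes the columns are named student_id, date and overall score as in this thread; the sample data is made up for illustration.

```python
import pandas as pd

# Hypothetical sample data; column names follow the ones used in this thread.
df = pd.DataFrame({
    "student_id": [1, 2, 1, 2],
    "date": pd.to_datetime(["2021-03-01", "2021-01-15", "2021-01-10", "2021-02-20"]),
    "overall score": [80, 65, 70, 75],
})

# Sort by date so the last row for each student is the most recent one,
# then keep only that last row per student_id.
latest = (
    df.sort_values("date")
      .drop_duplicates(subset="student_id", keep="last")
      .sort_values("student_id")
)
print(latest)
```

If the date column is stored as strings, converting it with pd.to_datetime before sorting is probably safer, since string sorting only matches date order for formats like YYYY-MM-DD.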
CodePudding user response:
I would group by student_id, then pick the latest entry by date.
for student, group in df.groupby('student_id'):
    # keep the row(s) whose date equals this student's latest date
    last_grade = group.loc[group['date'] == group['date'].max(), 'overall score']
    print(student, last_grade)
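The same idea can be written without an explicit loop. This is a sketch using groupby with idxmax, assuming the dataframe has a unique index and the column names used above; the sample data is hypothetical.

```python
import pandas as pd

# Hypothetical sample data; column names follow the ones used in this thread.
df = pd.DataFrame({
    "student_id": [1, 2, 1, 2],
    "date": pd.to_datetime(["2021-03-01", "2021-01-15", "2021-01-10", "2021-02-20"]),
    "overall score": [80, 65, 70, 75],
})

# idxmax gives, for each student, the index label of the row with the latest date.
latest_idx = df.groupby("student_id")["date"].idxmax()

# Use those labels to pull the matching rows from the original dataframe.
latest = df.loc[latest_idx, ["student_id", "overall score"]]
print(latest)
```

Unlike the loop, this returns exactly one row per student even when two rows share the same latest date.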
CodePudding user response:
To get the latest score you have to select by date. Assuming df is your dataframe, this returns the most recent date in the whole dataframe (not one per student):
df["date"].max()
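To turn that into a per-student result, one option is to compare each row's date with the maximum date of its own student. This is only a sketch, assuming the column names from this thread and made-up sample data.

```python
import pandas as pd

# Hypothetical sample data; column names follow the ones used in this thread.
df = pd.DataFrame({
    "student_id": [1, 2, 1, 2],
    "date": pd.to_datetime(["2021-03-01", "2021-01-15", "2021-01-10", "2021-02-20"]),
    "overall score": [80, 65, 70, 75],
})

# Latest date across the whole dataframe (a single value).
overall_latest = df["date"].max()

# Latest date per student, broadcast back to every row of that student.
latest_per_student = df.groupby("student_id")["date"].transform("max")

# Keep the rows whose date equals their own student's latest date.
latest = df[df["date"] == latest_per_student]
print(latest[["student_id", "overall score"]])
```

Note that this keeps every row tied for the latest date, so a student can appear more than once if two entries share that date.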