I have data with 1000 rows. Example structure:
I need to find the winners (surnames and names of the children) in each age group,
separately for boys and for girls, taking the first 3 places in each event.
Keep in mind that there can be more than 3 winners, because results may tie.
I tried a double grouping by year and run time, but I need the top 3 places within each year, and any number of athletes can share the same place (the same run time).
One example of my code:
df_agg_girl = data[data['sex']=='ж'].groupby([data.year,data.run1000,data.fullName]).agg("mean")
g = df_agg_girl['run1000'].groupby(['year','run1000'], group_keys=False)
res = g.apply(lambda x: x.sort_values(ascending=False).nsmallest(3,keep='all'))
print(res)
I really don't understand how to do this.
CodePudding user response:
Can you provide more information about the columns? Please try to share the real format of the data. Based on what I can see, you could rank the run times within each year and sex, then keep every row ranked 3 or better:
df["ranking"] = df.groupby(["year",'sex'])["run"].rank(method="dense", ascending=True)
print(df[df.ranking <= 3])
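To show how the dense ranking handles ties, here is a minimal self-contained sketch. The sample rows are made up, the column names (year, sex, fullName, run1000) are taken from the question's code, and I am assuming that year stands in for the age group and that a lower run1000 is a better result. The "place" column is a name I introduced for illustration.

```python
import pandas as pd

# Hypothetical sample data; column names follow the question's code.
data = pd.DataFrame({
    "year": [2010] * 7,
    "sex": ["ж", "ж", "ж", "ж", "ж", "м", "м"],
    "fullName": ["A", "B", "C", "D", "E", "F", "G"],
    "run1000": [240, 240, 250, 260, 270, 200, 210],  # assumed: lower is better
})

# Dense rank inside each (year, sex) group: tied times share a place
# and the next distinct time gets the next place with no gap.
data["place"] = (
    data.groupby(["year", "sex"])["run1000"]
        .rank(method="dense", ascending=True)
        .astype(int)
)

# Keep everyone in places 1 to 3; ties can make this more than 3 rows per group.
winners = (
    data[data["place"] <= 3]
    .sort_values(["year", "sex", "place", "fullName"])
    .reset_index(drop=True)
)
print(winners[["year", "sex", "place", "fullName", "run1000"]])
```

With this data, A and B tie for first place among the girls, so C is second and D is third, giving four winners in that group. With method="min" instead of "dense", C would get place 3 and D place 4, so D would be dropped. If there are several events, one option is to repeat the ranking for each result column.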