Home > Net >  sort value per group in already grouped dataframe
sort value per group in already grouped dataframe

Time:09-16

There are many answers to sorting per group when performing groupby operation however in my case I groupby then use aggregated columns to create another column. I want to sort by newly created column per group.

MRE:

df = pd.DataFrame({"A":[1,1,1,3,3,3, 1,1,1,3,3,3],
                   "B":["a", "b", "c", "a", "b", "c", "a", "b", "c", "a", "b", "c"],
                   "click":[100, 200, 123, 333, 222, 333, 100, 200, 123, 333, 222, 333],
                   "exp":[10000, 10000, 10000, 10000, 10000, 10000, 20000, 20000, 20000, 20000, 20000, 20000]})

grp_df = df.groupby(["A", "B"]).sum()
grp_df["ctr"] = grp_df["click"] / grp_df["exp"] * 100

outputs:

        click   exp     ctr
A   B           
1   a   200     30000   0.666667
    b   400     30000   1.333333
    c   246     30000   0.820000
3   a   666     30000   2.220000
    b   444     30000   1.480000
    c   666     30000   2.220000

Desired output:

        click   exp     ctr
A   B           
1   b   400     30000   1.333333
    c   246     30000   0.820000
    a   200     30000   0.666667
3   a   666     30000   2.220000
    c   666     30000   2.220000
    b   444     30000   1.480000

CodePudding user response:

You can groupby A (level=0) and then sort_values by ctr column:

grp_df.groupby(level=0).apply(
  lambda g: g.sort_values('ctr', ascending=False)
).reset_index(level=0, drop=True)

     click    exp       ctr
A B                        
1 b    400  30000  1.333333
  c    246  30000  0.820000
  a    200  30000  0.666667
3 a    666  30000  2.220000
  c    666  30000  2.220000
  b    444  30000  1.480000

Or as @haneulkim commented, a more concise option would be with group_keys=False:

grp_df.groupby(level=0, group_keys=False).apply(
  lambda g: g.sort_values('ctr', ascending=False))

CodePudding user response:

You could use sort_values, and pair the grouper with the new column:

 grp_df.sort_values(['A', 'ctr'], ascending = [True, False])

     click    exp       ctr
A B
1 b    400  30000  1.333333
  c    246  30000  0.820000
  a    200  30000  0.666667
3 a    666  30000  2.220000
  c    666  30000  2.220000
  b    444  30000  1.480000
  • Related