There are many answers to sorting per group when performing groupby
operation however in my case I groupby then use aggregated columns to create another column. I want to sort by newly created column per group.
MRE:
df = pd.DataFrame({"A":[1,1,1,3,3,3, 1,1,1,3,3,3],
"B":["a", "b", "c", "a", "b", "c", "a", "b", "c", "a", "b", "c"],
"click":[100, 200, 123, 333, 222, 333, 100, 200, 123, 333, 222, 333],
"exp":[10000, 10000, 10000, 10000, 10000, 10000, 20000, 20000, 20000, 20000, 20000, 20000]})
grp_df = df.groupby(["A", "B"]).sum()
grp_df["ctr"] = grp_df["click"] / grp_df["exp"] * 100
outputs:
click exp ctr
A B
1 a 200 30000 0.666667
b 400 30000 1.333333
c 246 30000 0.820000
3 a 666 30000 2.220000
b 444 30000 1.480000
c 666 30000 2.220000
Desired output:
click exp ctr
A B
1 b 400 30000 1.333333
c 246 30000 0.820000
a 200 30000 0.666667
3 a 666 30000 2.220000
c 666 30000 2.220000
b 444 30000 1.480000
CodePudding user response:
You can groupby A
(level=0) and then sort_values
by ctr
column:
grp_df.groupby(level=0).apply(
lambda g: g.sort_values('ctr', ascending=False)
).reset_index(level=0, drop=True)
click exp ctr
A B
1 b 400 30000 1.333333
c 246 30000 0.820000
a 200 30000 0.666667
3 a 666 30000 2.220000
c 666 30000 2.220000
b 444 30000 1.480000
Or as @haneulkim commented, a more concise option would be with group_keys=False
:
grp_df.groupby(level=0, group_keys=False).apply(
lambda g: g.sort_values('ctr', ascending=False))
CodePudding user response:
You could use sort_values
, and pair the grouper with the new column:
grp_df.sort_values(['A', 'ctr'], ascending = [True, False])
click exp ctr
A B
1 b 400 30000 1.333333
c 246 30000 0.820000
a 200 30000 0.666667
3 a 666 30000 2.220000
c 666 30000 2.220000
b 444 30000 1.480000