let say I have a pandas data frame and already grouped as
grp=df.groupby(['a','b' ]).sum()
now I would like to calculate for every group a
, the percentage of b
for each column ,
for example: P1, aaaa = 11/484
, P1, aaac = 8/357
, N1, aaaa = 61/7183
so on ....
Reproducible grouped data
pd.DataFrame({'aaaa': {('P 1', 0): 484,('P 1', 1): 11,}})
CodePudding user response:
You can do:
grp.loc[(slice(None), 1),:].droplevel(1)/grp.loc[(slice(None), 0),:].droplevel(1)
In practice whith grp.loc[(slice(None), 1),:]
and grp.loc[(slice(None), 0),:]
I extract only the rows with b==1
and b==0
(try yourself and see the output); after that I need to remove the b
level (.droplevel(1)
) to make these two objects have the same index (the columns are already shared); finally I divided this two matrices with /
(now I can do it because now they have same index and columns). Hope it is clear :)
CodePudding user response:
You can use xs
to a select particular level of a MultiIndex:
out = df.xs(1, level=1) / df.xs(0, level=1)
Output:
aaaa
P 1 0.022727