import pandas as pd

df = pd.DataFrame({'subset_product': ['A', 'A', 'B', 'C', 'C', 'D'],
                   'success': [1, 0, 1, 1, 0, 0],
                   'count': [5, 4, 1, 1, 1, 0]})
Above is my dataframe. success = 1 means success and success = 0 means failure; count is the number of such events. I need the success percentage for each subset_product.
Expected output:
A 5/9
B 1
C 1/2
D 0
I am getting the total count per group using the code below:
a = df.groupby('subset_product')['count'].transform('sum')
But I am stuck at this point with only the count itself.
CodePudding user response:
IIUC, you can add a helper column and compute the sums per group:
g = df.eval('x=success*count').groupby('subset_product')
out = g['x'].sum()/g['count'].sum()
Output (D is NaN because its total count is 0, so the ratio is 0/0):
subset_product
A 0.555556
B 1.000000
C 0.500000
D NaN
dtype: float64
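If you want D to show 0 as in the expected output rather than NaN, one option is to fill the missing ratio afterwards. This is only a sketch, and it assumes a group with no events at all should count as a 0 success rate, which is a choice you may or may not want:

```python
import pandas as pd

df = pd.DataFrame({'subset_product': ['A', 'A', 'B', 'C', 'C', 'D'],
                   'success': [1, 0, 1, 1, 0, 0],
                   'count': [5, 4, 1, 1, 1, 0]})

# Helper column x holds the successes per row: count if success == 1, else 0
g = df.eval('x = success * count').groupby('subset_product')

# successes / total events per group; a 0/0 group comes out as NaN,
# which fillna(0) maps to 0 (assumption: "no events" means rate 0)
out = (g['x'].sum() / g['count'].sum()).fillna(0)
print(out)
```

Leaving the NaN in place is arguably more honest when "no data" and "no successes" should stay distinguishable, so only fill it if 0 is really what you mean.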
Alternative using pivot_table:
d = df.pivot_table(index='subset_product', columns='success',
                   values='count', aggfunc='sum')
d[1] / d.sum(axis=1)
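For completeness, here is a self-contained sketch of the pivot_table variant. The fill_value=0 argument and the final fillna(0) are my additions, not part of the snippet above: the first replaces missing success/failure combinations with 0 in the intermediate table, and the second assumes a product with no events should report a rate of 0.

```python
import pandas as pd

df = pd.DataFrame({'subset_product': ['A', 'A', 'B', 'C', 'C', 'D'],
                   'success': [1, 0, 1, 1, 0, 0],
                   'count': [5, 4, 1, 1, 1, 0]})

# One row per product; column 0 holds the failure count, column 1 the success count
d = df.pivot_table(index='subset_product', columns='success',
                   values='count', aggfunc='sum', fill_value=0)

# Success count divided by the row total; D is 0/0 -> NaN -> 0 (assumption)
rate = (d[1] / d.sum(axis=1)).fillna(0)
print(rate)
```

The intermediate table d is handy if you also want to inspect the raw success and failure counts next to the ratio.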