Pandas: get normalized values using groupby, size and transform

import pandas as pd

df = pd.DataFrame({'subset_product': ['A', 'A', 'B', 'C', 'C', 'D'],
                   'success': [1, 0, 1, 1, 0, 0],
                   'count': [5, 4, 1, 1, 1, 0]})

Above is the dataframe. success = 1 means success and success = 0 means failure; count is the number of such events. I need to get the success percentage for each subset product.

expected output:

A   5/9
B   1
C   1/2
D   0

I am getting the total count per group using the line below:

a = df.groupby('subset_product')['count'].transform('sum')

But I am stuck here with only the count.

CodePudding user response:

IIUC, you can add a helper column and compute the sums per group:

g = df.eval('x=success*count').groupby('subset_product')
out = g['x'].sum()/g['count'].sum()

output:

subset_product
A    0.555556
B    1.000000
C    0.500000
D         NaN
dtype: float64
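Note that D comes out as NaN (0/0), while the expected output in the question shows 0. A minimal sketch of one way to get 0 instead, assuming that a group with no events should count as a 0 success rate (that choice is an assumption, not something the question states). It uses `assign` in place of `eval` for the helper column; the name `rate` is just illustrative:

```python
import pandas as pd

df = pd.DataFrame({'subset_product': ['A', 'A', 'B', 'C', 'C', 'D'],
                   'success': [1, 0, 1, 1, 0, 0],
                   'count': [5, 4, 1, 1, 1, 0]})

# Helper column: the count contributes only when success == 1
g = df.assign(x=df['success'] * df['count']).groupby('subset_product')

# 0/0 gives NaN for D; fillna(0) treats "no events" as a 0 rate (assumption)
rate = (g['x'].sum() / g['count'].sum()).fillna(0)
print(rate)
```

If NaN is the more honest answer for an empty group, simply drop the `fillna(0)` call.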

Alternative using `pivot_table`:

d = df.pivot_table(index='subset_product', columns='success',
                   values='count', aggfunc='sum')
d[1]/d.sum(axis=1)
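If the exact fractions from the expected output (5/9, 1, 1/2, 0) are wanted rather than floats, a rough sketch using the standard library's `fractions.Fraction` on top of the same pivot could look like this. The `fill_value=0` argument and the rule that an empty group maps to 0 are assumptions made here to match the expected output; `fracs` is a hypothetical name:

```python
from fractions import Fraction

import pandas as pd

df = pd.DataFrame({'subset_product': ['A', 'A', 'B', 'C', 'C', 'D'],
                   'success': [1, 0, 1, 1, 0, 0],
                   'count': [5, 4, 1, 1, 1, 0]})

# Same pivot as above, with missing success/failure cells filled with 0
d = df.pivot_table(index='subset_product', columns='success',
                   values='count', aggfunc='sum', fill_value=0)

successes = d[1]
totals = d.sum(axis=1)

# Build an exact fraction per group; an empty group (total 0) maps to 0 (assumption)
fracs = {
    key: Fraction(int(s), int(t)) if t else Fraction(0)
    for key, s, t in zip(d.index, successes, totals)
}

for key, frac in fracs.items():
    print(key, frac)
```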
