Pandas get normalize using groupby and size and transform

Time:03-19

import pandas as pd

df = pd.DataFrame({'subset_product':['A','A','B','C','C','D'],
                   'success': [1, 0, 1, 1, 0, 0],
                   'count':[5,4,1,1,1,0]})

Above is the DataFrame. success = 1 means success and success = 0 means failure; count is the number of such events. I need the success rate (successful events divided by all events) for each subset_product.

expected output:

A   5/9
B   1
C   1/2
D   0

I am getting the total count per group using the code below:

a = df.groupby('subset_product')['count'].transform('sum')

But I am stuck here with just the count.

CodePudding user response:

IIUC, you can add a helper column and compute the sums per group:

g = df.eval('x=success*count').groupby('subset_product')
out = g['x'].sum()/g['count'].sum()

output:

subset_product
A    0.555556
B    1.000000
C    0.500000
D         NaN
dtype: float64
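
Note that D comes out as NaN here (0 successes divided by 0 events), while the expected output in the question lists 0. A minimal, self-contained sketch of the same helper-column idea follows, assuming that a group with no events should report 0; the fillna(0) step is that assumption, not part of the original answer, and assign is used in place of eval only to keep the helper column explicit:

```python
import pandas as pd

df = pd.DataFrame({'subset_product': ['A', 'A', 'B', 'C', 'C', 'D'],
                   'success': [1, 0, 1, 1, 0, 0],
                   'count': [5, 4, 1, 1, 1, 0]})

# Helper column x: the event count on success rows, 0 on failure rows.
g = df.assign(x=df['success'] * df['count']).groupby('subset_product')

# Successful events divided by all events per group; 0/0 yields NaN for D.
out = g['x'].sum() / g['count'].sum()

# Assumption: a group with no events should report 0 rather than NaN.
out = out.fillna(0)
print(out)
```

If NaN is the more honest answer for a group with no events, drop the fillna(0) line.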

Alternative using pivot_table:

d = df.pivot_table(index='subset_product', columns='success',
                   values='count', aggfunc='sum')
d[1]/d.sum(axis=1)
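
If the rate is needed as a column aligned with the original rows, in the spirit of the transform attempt in the question, the two sums can be broadcast back with transform. This is only a sketch under that assumption; the column name success_rate is hypothetical:

```python
import pandas as pd

df = pd.DataFrame({'subset_product': ['A', 'A', 'B', 'C', 'C', 'D'],
                   'success': [1, 0, 1, 1, 0, 0],
                   'count': [5, 4, 1, 1, 1, 0]})

# Per-row successful events, summed within each group and broadcast to every row.
successes = (df['success'] * df['count']).groupby(df['subset_product']).transform('sum')

# Total events per group, broadcast the same way (this is the question's own line).
totals = df.groupby('subset_product')['count'].transform('sum')

# Hypothetical column name; D stays NaN because its total count is 0.
df['success_rate'] = successes / totals
print(df)
```

Unlike the grouped sums above, this keeps one value per original row, so every row of a group carries the same rate.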