Divide 'count' and 'sum' inside agg function in pandas

Time:10-20

Using groupby and agg, is it possible to get 'count' divided by 'sum' in a single expression?

import pandas as pd

data = pd.DataFrame({'id':  [1, 1, 1, 2, 2, 3, 3, 3, 3, 4, 4],
                     'val': [10, 11, 10, 13, 12, 15, 12, 9, 8, 14, 2]})

data.groupby(['id'])['val'].agg(['count','sum'])

CodePudding user response:

Assuming this is a dummy example (otherwise just take the reciprocal of the mean, since count/sum is 1/mean), yes, it is possible to combine aggregators using apply:

out = data.groupby(['id'])['val'].apply(lambda g: g.count()/g.sum())

Better alternative in my opinion:

g = data.groupby(['id'])['val']
out = g.count()/g.sum()

# as a one-liner in Python ≥ 3.8 (uses the walrus operator)
# out = (g:=data.groupby(['id'])['val']).count().div(g.sum())

output:

id
1    0.096774
2    0.080000
3    0.090909
4    0.125000
Name: val, dtype: float64
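
As a quick sanity check on the remark about the mean: count/sum is the reciprocal of the group mean, so for real (non-dummy) data a single built-in aggregation is enough. A minimal sketch using the question's data (the names ratio and inv_mean are only illustrative):

```python
import pandas as pd

data = pd.DataFrame({'id':  [1, 1, 1, 2, 2, 3, 3, 3, 3, 4, 4],
                     'val': [10, 11, 10, 13, 12, 15, 12, 9, 8, 14, 2]})

g = data.groupby('id')['val']

# count divided by sum, as asked in the question
ratio = g.count() / g.sum()

# reciprocal of the group mean; should match ratio up to rounding
inv_mean = 1 / g.mean()

print(ratio)
print(inv_mean)
```

Both count and mean skip NaN values by default, so the two results should agree even when 'val' has missing entries.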

CodePudding user response:

Use a lambda function:

s = data.groupby(['id'])['val'].agg(lambda g: g.count()/g.sum())
print(s)
id
1    0.096774
2    0.080000
3    0.090909
4    0.125000
Name: val, dtype: float64

Or use DataFrame.eval, after renaming the count and sum columns to a and b:

s = data.groupby(['id'])['val'].agg([('a','count'),('b','sum')]).eval('a / b')
print(s)

id
1    0.096774
2    0.080000
3    0.090909
4    0.125000
dtype: float64
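
The same eval idea can probably be written with keyword-style named aggregation instead of a list of tuples. This is a sketch that assumes pandas 0.25 or later; the column names n and total are made up for illustration:

```python
import pandas as pd

data = pd.DataFrame({'id':  [1, 1, 1, 2, 2, 3, 3, 3, 3, 4, 4],
                     'val': [10, 11, 10, 13, 12, 15, 12, 9, 8, 14, 2]})

# Named aggregation: each keyword becomes an output column
# (n and total are hypothetical names chosen for this example)
stats = data.groupby('id')['val'].agg(n='count', total='sum')

# Divide the two aggregated columns in one expression
s = stats.eval('n / total')
print(s)
```

Keeping the intermediate stats frame is optional, but it makes it easy to inspect the count and sum before dividing.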