I have some data similar to:
df = pd.DataFrame({'ID':[1, 2, 3, 4, 5],
'1' :[7, 8, 2, 3, 0],
'2' :[1, 0, 4, 9, 9],
'3' :[3, 1, 1, 6, 8]})
ID 1 2 3
0 1 7 1 3
1 2 8 0 1
2 3 2 4 1
3 4 3 9 6
4 5 0 9 8
and I want to add an Av, Sum, Count and Max column for each row for the values in columns 1, 2 and 3. Currently I am doing this like this:
cols = ['1', '2', '3']
df['Av'] = df[cols].mean(axis=1)
df['Sum'] = df[cols].sum(axis=1).fillna(0)
df['Count'] = df[cols].count(axis=1).fillna(0)
df['Max'] = df[cols].max(axis=1).fillna(0)
ID 1 2 3 Av Sum Count Max
0 1 7 1 3 3.666667 11 3 7
1 2 8 0 1 3.000000 9 3 8
2 3 2 4 1 2.333333 7 3 4
3 4 3 9 6 6.000000 18 3 9
4 5 0 9 8 5.666667 17 3 9
I was wondering if there is a way to do this neater/in 1 or 2 lines like with groupby().agg()
?
CodePudding user response:
Use DataFrame.agg
with list of functions:
df = df.join(df[cols].agg(['mean','sum','count','max'], axis=1).fillna(0).convert_dtypes())
print (df)
ID 1 2 3 mean sum count max
0 1 7 1 3 3.666667 11 3 7
1 2 8 0 1 3.0 9 3 8
2 3 2 4 1 2.333333 7 3 4
3 4 3 9 6 6.0 18 3 9
4 5 0 9 8 5.666667 17 3 9
With rename
columns:
d = {'mean':'Av', 'sum':'Sum','count':'Count', 'max':'Max'}
df = (df.join(df[cols].agg(list(d.keys()), axis=1).fillna(0)
.convert_dtypes().rename(columns=d)))
print (df)
ID 1 2 3 Av Sum Count Max
0 1 7 1 3 3.666667 11 3 7
1 2 8 0 1 3.0 9 3 8
2 3 2 4 1 2.333333 7 3 4
3 4 3 9 6 6.0 18 3 9
4 5 0 9 8 5.666667 17 3 9
Another idea:
d = {'mean':'Av', 'sum':'Sum','count':'Count', 'max':'Max'}
df = df.join(pd.concat({v: df[cols].agg(k, axis=1).fillna(0)
for k, v in d.items()}, axis=1))
print (df)
ID 1 2 3 Av Sum Count Max
0 1 7 1 3 3.666667 11 3 7
1 2 8 0 1 3.000000 9 3 8
2 3 2 4 1 2.333333 7 3 4
3 4 3 9 6 6.000000 18 3 9
4 5 0 9 8 5.666667 17 3 9