Pandas groupby cumulative sum, mean, median based on flag values

Time:10-26

I need to compute cumulative statistics (sum, mean, median, etc.) of Amount based on the values of Indicator.

The calculation should run over the rows leading up to each True and print the result next to that True. The calculated value should then reset and start again from that True row until the next True appears in Indicator.

Input data frame

Amount Indicator
10 False
20 False
5 True
8 False
4 False
6 True
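
For reproducibility, the input above can be built roughly like this. It is a minimal sketch: the variable name df is an assumption chosen to match the answers, and the dtypes (integer Amount, boolean Indicator) are inferred from the listing.

```python
import pandas as pd

# Sample input from the question; `df` is an assumed name matching the answers.
df = pd.DataFrame({
    "Amount": [10, 20, 5, 8, 4, 6],
    "Indicator": [False, False, True, False, False, True],
})
print(df)
```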

The expected output would be:

Amount  Indicator  Sum  Mean
10      False
20      False
5       True       30   15
8       False
4       False
6       True       17   5.67

I tried groupby with cumsum(), using Indicator as the group key, but have not had any luck yet.

CodePudding user response:

In one go using only one groupby:

import numpy as np

# Each True starts a new group; the True rows take the previous group's stats.
df[["Sum", "Mean"]] = np.nan
df.loc[df.Indicator, ["Sum", "Mean"]] = (df.groupby(df.Indicator.cumsum())
                                         .agg({"Amount": ["sum", "mean"]})
                                         .shift(1).iloc[1:].values)
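
To make the grouping key easier to follow, and to cover the median mentioned in the question, here is a hedged step-by-step sketch of the same idea. It is not the answerer's code: the names key and stats and the capitalized output column names are assumptions, and the lookup uses Series.map instead of shift, so a frame whose first row is True simply gets NaN there.

```python
import pandas as pd

df = pd.DataFrame({
    "Amount": [10, 20, 5, 8, 4, 6],
    "Indicator": [False, False, True, False, False, True],
})

# Each True starts a new block, so the running count of True values labels the blocks.
key = df["Indicator"].cumsum()  # 0, 0, 1, 1, 1, 2

# One row of statistics per block: block 0 = rows 0-1, block 1 = rows 2-4, block 2 = row 5.
stats = df.groupby(key)["Amount"].agg(["sum", "mean", "median"])

# A True row labelled k reports block k - 1; labels with no previous block map to NaN.
for col in ["sum", "mean", "median"]:
    df[col.capitalize()] = (key - 1).map(stats[col]).where(df["Indicator"])

print(df)
```

Adding another statistic is then a matter of extending the list passed to agg and the loop.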

CodePudding user response:

Let's use the cumulative sum of the Indicator column as the group key, then take the running sum and the group size of the Amount column:

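# Running count of True values: every True row starts a new group
m = df['Indicator'].cumsum()

# Shift the within-group running sum down one row, then keep it on True rows only
df['Sum'] = df.groupby(m)['Amount'].cumsum().shift().where(df['Indicator'])
# Move each total up to the last row of its group, divide by the group size, shift back
df['Mean'] = df['Sum'].shift(-1).div(df.groupby(m)['Amount'].transform('size')).shift()
print(df)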

   Amount  Indicator   Sum       Mean
0      10      False   NaN        NaN
1      20      False   NaN        NaN
2       5       True  30.0  15.000000
3       8      False   NaN        NaN
4       4      False   NaN        NaN
5       6       True  17.0   5.666667
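
As a sanity check, both answers can be run on the sample data and compared. This is only a sketch on the six-row example from the question; the names a and b are assumptions, and the comparison does not show that the two approaches agree on every possible input (for instance, a frame whose first row is True).

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Amount": [10, 20, 5, 8, 4, 6],
    "Indicator": [False, False, True, False, False, True],
})
m = df["Indicator"].cumsum()

# First answer: aggregate per group, shift by one group, assign to the True rows.
a = df.copy()
a[["Sum", "Mean"]] = np.nan
a.loc[a.Indicator, ["Sum", "Mean"]] = (a.groupby(m)
                                       .agg({"Amount": ["sum", "mean"]})
                                       .shift(1).iloc[1:].values)

# Second answer: running sum within each group, shifted onto the True rows.
b = df.copy()
b["Sum"] = b.groupby(m)["Amount"].cumsum().shift().where(b["Indicator"])
b["Mean"] = b["Sum"].shift(-1).div(b.groupby(m)["Amount"].transform("size")).shift()

# Raises AssertionError if the two results differ beyond floating-point tolerance.
pd.testing.assert_frame_equal(a, b)
```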