I am desperately searching for a solution with pandas. Maybe you could help me.
I am looking for a rolling mean that takes the previous mean into account.
My `df` looks like this:
index | count |
---|---|
0 | 4 |
1 | 6 |
2 | 10 |
3 | 12 |
Now, using `rolling(window=2).mean()`, I would get something like this:
index | count | r_mean |
---|---|---|
0 | 4 | NaN |
1 | 6 | 5 |
2 | 10 | 8 |
3 | 12 | 11 |
Instead, I would like each row to average the current count with the previously computed mean, like this:
index | count | r_mean |
---|---|---|
0 | 4 | NaN |
1 | 6 | 5 |
2 | 10 | 7.5 |
3 | 12 | 9.75 |
where:
row 1: (4 + 6) / 2 = 5
row 2: (5 + 10) / 2 = 7.5
row 3: (7.5 + 12) / 2 = 9.75
Thank you in advance!
CodePudding user response:
We can use a simple Python loop for this. If you would like to speed it up, you can try numba.
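    import numpy as np

    l = []
    n = 2
    for x, y in zip(df['count'], df.index):
        try:
            # average the current value with the previous result
            l.append(np.nansum(x + l[y - n + 1]) / n)
        except IndexError:
            # no previous result yet: seed the list with the first value
            l.append(x)
    df.loc[n-1:, 'new'] = l[n-1:]
    df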
    Out[332]:
       index  count   new
    0      0      4   NaN
    1      1      6  5.00
    2      2     10  7.50
    3      3     12  9.75
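The same idea can be wrapped in a small function that works on a plain array instead of relying on the DataFrame index. This is only a minimal sketch for the window of 2 used in the question; the name `recursive_mean` is made up for illustration and is not a pandas or NumPy function.

```python
import numpy as np
import pandas as pd


def recursive_mean(values):
    """Average each value with the previous result; the first slot stays NaN."""
    out = np.full(len(values), np.nan)
    prev = None
    for i, x in enumerate(values):
        if prev is None:
            # Seed the recurrence with the first value, but report NaN for it.
            prev = float(x)
        else:
            prev = (prev + x) / 2
            out[i] = prev
    return out


# Sample data copied from the question.
df = pd.DataFrame({"count": [4, 6, 10, 12]})
df["new"] = recursive_mean(df["count"].to_numpy())
print(df)
```

Keeping the running value in `prev` avoids the list indexing and the `try`/`except` of the loop above, at the cost of supporting only this one recurrence.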
CodePudding user response:
EDIT: pandas actually implements the method `ewm`, which can do this calculation directly:

    df['res'] = df['count'].ewm(alpha=0.5, adjust=False, min_periods=2).mean()
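As a quick sanity check, the `ewm` result can be compared against the recurrence written out by hand. With `adjust=False`, pandas documents the recursive form `y[0] = x[0]`, `y[t] = (1 - alpha) * y[t-1] + alpha * x[t]`, which for `alpha=0.5` is exactly "average the new value with the previous result". The sample frame below is rebuilt from the question, and the `manual` list is only there for comparison.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({"count": [4, 6, 10, 12]})

# min_periods=2 turns the first row into NaN, as requested in the question.
ewm_res = df["count"].ewm(alpha=0.5, adjust=False, min_periods=2).mean()

# Explicit recurrence: seed with the first value, then average with each new one.
prev = float(df["count"].iloc[0])
manual = [np.nan]
for x in df["count"].iloc[1:]:
    prev = (prev + x) / 2
    manual.append(prev)

print(ewm_res.tolist())
print(manual)
```

Both lists should agree row by row, with NaN in the first position.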
Original answer: Here is one way, since the whole calculation can be expanded with coefficients that are powers of 2.
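    # first create a series of powers of 2 (the first coefficient is clipped to 2)
    coef = pd.Series(2**np.arange(len(df)), df.index).clip(lower=2)
    # weight each count, accumulate, then divide by the current power of 2
    df['res'] = (coef.div(2)*df['count']).cumsum()/coef
    print(df)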
       index  count   res
    0      0      4  2.00
    1      1      6  5.00
    2      2     10  7.50
    3      3     12  9.75
If needed, you can mask the first value with `df.loc[0, 'res'] = np.nan`.
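To see where the powers of 2 come from, it may help to unroll the recurrence from the question. The symbols $c_t$ (the count in row $t$) and $m_t$ (the running result, seeded with $m_0 = c_0$) are notation introduced here for illustration only:

```latex
\begin{aligned}
m_t &= \frac{m_{t-1} + c_t}{2}, \qquad m_0 = c_0 \\
2^t m_t &= 2^{t-1} m_{t-1} + 2^{t-1} c_t \\
        &= c_0 + \sum_{i=1}^{t} 2^{i-1} c_i \\
m_t &= \frac{c_0 + \sum_{i=1}^{t} 2^{i-1} c_i}{2^t}, \qquad t \ge 1
\end{aligned}
```

In the code, `coef.div(2)` supplies the weights 1, 1, 2, 4, ..., `cumsum()` builds the numerator, and dividing by `coef` applies the $2^t$ denominator. For row 3 of the example this gives $(4 + 6 + 20 + 48) / 8 = 9.75$. Row 0 is divided by the clipped coefficient 2 rather than 1, which is why it shows 2.00 instead of a meaningful value. Because the coefficients double with every row, a long frame would eventually exceed the 64-bit integer range, so `ewm` is probably the safer choice there.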