Calculate the rolling mean of every n-th element over an m-element window in Python

Suppose I have a vector like so:

s = pd.Series(range(50))

The rolling mean over, say, a 2-element window is easily calculated:

s.rolling(window=2, min_periods=2).mean()

Now, instead of taking 2 adjacent elements for the window, I want to take e.g. every third element, while still using only the last 2 of them. The result would be this vector:

0    NaN
1    NaN
2    NaN
3    1.5 -- (3 + 0)/2
4    2.5 -- (4 + 1)/2
5    3.5 -- (5 + 2)/2
6    4.5 -- ...
7    5.5
8    6.5
9    7.5
...

How can I achieve this efficiently?

Thanks!

CodePudding user response：

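You can use numpy.lib.stride_tricks.as_strided, whose strides argument specifies the number of bytes to step in each dimension when traversing an array. Build a view in which each row holds an element and the element 3 positions after it, then average each row:

import numpy as np
arr = np.arange(10)
# Row i is a view of (arr[i], arr[i + 3]); the shape is chosen so no row reads past the end of the buffer.
strided = np.lib.stride_tricks.as_strided(arr, shape=(len(arr) - 3, 2), strides=(arr.itemsize, 3 * arr.itemsize))
result = strided.mean(axis=1)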

output:

array([1.5, 2.5, 3.5, 4.5, 5.5, 6.5, 7.5])
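
As a possibly safer variant of the same idea, here is a minimal sketch using numpy.lib.stride_tricks.sliding_window_view (available in NumPy 1.20 and later), which checks bounds for you and returns a read-only view. The function name strided_mean and its parameters step and m are hypothetical, introduced only for illustration:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def strided_mean(values, step=3, m=2):
    """Mean of m elements spaced `step` apart (hypothetical helper)."""
    values = np.asarray(values)
    span = (m - 1) * step + 1                    # positions covered by one window
    windows = sliding_window_view(values, span)  # shape: (len(values) - span + 1, span)
    return windows[:, ::step].mean(axis=1)       # keep every step-th column, then average

result = strided_mean(np.arange(10))
print(result)  # [1.5 2.5 3.5 4.5 5.5 6.5 7.5]
```

Like the as_strided version, this returns only the complete windows, so the leading NaN values from the question are not included; the input must contain at least (m - 1) * step + 1 elements.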

CodePudding user response：

This is not directly possible with rolling.

A workaround would be:

out = s.add(s.shift(3)).div(2)

Otherwise you need to use the underlying NumPy array (see @John's answer).
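
The shift workaround can probably be generalized to m elements spaced step positions apart by summing several shifted copies. This is only a sketch; the helper name spaced_rolling_mean and its parameters are assumptions, not part of pandas:

```python
import pandas as pd

def spaced_rolling_mean(s, step=3, m=2):
    """Mean of the current value and the m - 1 earlier values spaced `step` apart (hypothetical helper)."""
    total = s.astype(float)
    for k in range(1, m):
        total = total + s.shift(k * step)  # NaN propagates where history is missing
    return total / m

s = pd.Series(range(50))
out = spaced_rolling_mean(s)
print(out.head(6).tolist())  # [nan, nan, nan, 1.5, 2.5, 3.5]
```

With the defaults this should match s.add(s.shift(3)).div(2), including the three leading NaN values. It costs m - 1 shifted copies of the series, which is cheap for small m.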