I am working in pandas and want to implement an algorithm that requires computing a modified centered median over a rolling window, omitting the middle value. For instance, the unmodified version might be:
import pandas as pd
ser = pd.Series(data=[0.,1.,2.,4.5,5.,6.,8.,9])
med = ser.rolling(5,center=True).median()
print(med)
and I would like the result for med[3] to be 3.5 (the average of 2. and 5.) rather than 4.5. Is there an economical way to do this?
CodePudding user response:
Try:
import numpy as np
import pandas as pd
ser = pd.Series(data=[0.,1.,2.,4.5,5.,6.,8.,9])
med = ser.rolling(5).apply(lambda x: np.median(np.concatenate([x[0:2],x[3:5]]))).shift(-2)
print(med)
With output:
0     NaN
1     NaN
2    2.75
3    3.50
4    5.25
5    6.50
6     NaN
7     NaN
dtype: float64
And more generally:
rolling_size = 5
ser.rolling(rolling_size).apply(lambda x: np.median(np.concatenate([x[0:int(rolling_size/2)],x[int(rolling_size/2)+1:rolling_size]]))).shift(-int(rolling_size/2))
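The same idea can be wrapped in a small reusable function. The following is only a sketch: the name `rolling_median_excluding_center` is made up for illustration, and it assumes an odd window size so that there is exactly one middle value. It uses `center=True` instead of `shift`, and `raw=True` so that each window arrives as a NumPy array.

```python
import numpy as np
import pandas as pd


def rolling_median_excluding_center(series, window=5):
    """Centered rolling median that ignores the middle element of each window.

    Hypothetical helper; assumes `window` is odd so a single middle element exists.
    """
    if window % 2 == 0:
        raise ValueError("window must be odd")
    half = window // 2
    # raw=True passes each window as a NumPy array, so np.delete works positionally.
    return series.rolling(window, center=True).apply(
        lambda x: np.median(np.delete(x, half)), raw=True
    )


ser = pd.Series(data=[0., 1., 2., 4.5, 5., 6., 8., 9])
med = rolling_median_excluding_center(ser, window=5)
print(med)
```

For the sample series this should give 3.5 at index 3, as asked in the question, with NaN at the two positions on each end where the window is incomplete.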
CodePudding user response:
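ser = pd.Series(data=[0.,1.,2.,4.5,5.,6.,8.,9])
def median(series, window = 2):
    # values `window` positions ahead, re-indexed from 0
    df = pd.DataFrame(series[window:].reset_index(drop=True))
    # values `window` positions behind, aligned on the same index
    df[1] = series[:-window]
    # average each pair row by row
    df = df.apply(lambda x: x.mean(), axis=1)
    # shift the index so each result sits at the position between its pair
    df.index += window - 1
    return df
median(ser)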
I think this is simpler.
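One caveat worth checking: with the default `window = 2`, the function above averages the two immediate neighbours, `ser[i-1]` and `ser[i+1]`. That equals the median of the remaining four values only when the values in the window are sorted, as they are in the sample series. A minimal sketch of the difference, using made-up unsorted data:

```python
import numpy as np
import pandas as pd

# Made-up, deliberately unsorted data for illustration.
unsorted = pd.Series([5., 0., 9., 1., 7., 2., 8.])

# Median of each 5-wide centered window with the middle value removed.
true_med = unsorted.rolling(5, center=True).apply(
    lambda x: np.median(np.delete(x, 2)), raw=True
)

# Mean of the two immediate neighbours, ser[i-1] and ser[i+1].
neighbour_mean = (unsorted.shift(1) + unsorted.shift(-1)) / 2

comparison = pd.DataFrame({"median_without_center": true_med,
                           "neighbour_mean": neighbour_mean})
print(comparison)
```

At index 2, for example, the window is 5, 0, 9, 1, 7. Dropping the 9 leaves a median of 3.0, while the neighbour average is 0.5. So the shortcut is safe only if the data is known to be monotonic.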