median in pandas dropping center value

Time:04-28

I am working in pandas and want to implement an algorithm that requires computing a modified centered median over a rolling window, omitting the window's middle value. For instance, the unmodified version would be:

import pandas as pd
ser = pd.Series(data=[0.,1.,2.,4.5,5.,6.,8.,9])
med = ser.rolling(5,center=True).median()
print(med)

and I would like the result for med[3] to be 3.5 (the average of 2. and 5.) rather than 4.5. Is there an economical way to do this?

CodePudding user response:

Try:

import numpy as np
import pandas as pd
ser = pd.Series(data=[0.,1.,2.,4.5,5.,6.,8.,9])
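# Trailing window of 5: drop position 2 (the middle), take the median of the other four,
# then shift by -2 so each result lines up with the center of its window.
med = ser.rolling(5).apply(lambda x: np.median(np.concatenate([x[0:2],x[3:5]])), raw=True).shift(-2)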
print(med)

With output:

0     NaN
1     NaN
2    2.75
3    3.50
4    5.25
5    6.50
6     NaN
7     NaN

And more generally:

rolling_size = 5
half = rolling_size // 2  # position of the center value inside each window
ser.rolling(rolling_size).apply(lambda x: np.median(np.concatenate([x[0:half],x[half + 1:rolling_size]])), raw=True).shift(-half)
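
For reuse, the same idea can be wrapped in a small function. This is only a sketch: the name `centered_median_without_center` is made up for illustration, it assumes an odd window size, and it uses `center=True` with `raw=True` rather than a trailing window followed by `shift`.

```python
import numpy as np
import pandas as pd


def centered_median_without_center(series: pd.Series, window: int = 5) -> pd.Series:
    """Rolling centered median that ignores the middle value of each window.

    Hypothetical helper, not part of pandas; assumes an odd window size.
    """
    if window % 2 == 0:
        raise ValueError("window must be odd so that a single center value exists")
    half = window // 2
    # raw=True hands each window to the lambda as a NumPy array,
    # so np.delete removes the center by position.
    return series.rolling(window, center=True).apply(
        lambda x: np.median(np.delete(x, half)), raw=True
    )


ser = pd.Series(data=[0., 1., 2., 4.5, 5., 6., 8., 9])
med = centered_median_without_center(ser, window=5)
print(med)
```

Positions without a full window (the first two and last two here) come back as NaN, because `min_periods` defaults to the window size.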

CodePudding user response:

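import pandas as pd
ser = pd.Series(data=[0.,1.,2.,4.5,5.,6.,8.,9])
def median(series, window = 2):
    # Pair each value with the one `window` positions after it
    df = pd.DataFrame(series[window:].reset_index(drop=True))
    df[1] = series[:-window]
    # Average each pair row by row
    df = df.apply(lambda x: x.mean(), axis=1)
    # Shift the labels by window - 1 (for window=2, the position midway between each pair)
    df.index += window - 1
    return df
median(ser)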

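I think this is simpler. Note that it averages the values on either side of each position, which matches the center-dropped median only when the data in the window are sorted, as they are here.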
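
To see when this neighbour-averaging shortcut agrees with a true center-dropped median, here is a small comparison sketch. The helper names `neighbour_mean` and `median_without_center` and the unsorted sample series are assumptions made for illustration; `neighbour_mean` is a shift-based way of taking the same neighbour average.

```python
import numpy as np
import pandas as pd


def neighbour_mean(series: pd.Series) -> pd.Series:
    # Mean of the values immediately before and after each position
    return (series.shift(1) + series.shift(-1)) / 2


def median_without_center(series: pd.Series, window: int = 5) -> pd.Series:
    # Centered rolling median with the middle value of each window removed
    half = window // 2
    return series.rolling(window, center=True).apply(
        lambda x: np.median(np.delete(x, half)), raw=True
    )


sorted_ser = pd.Series([0., 1., 2., 4.5, 5., 6., 8., 9])
unsorted_ser = pd.Series([5., 0., 9., 1., 8., 2., 6., 4.5])

# Compare only positions 2..5, where the full 5-wide window exists
same_on_sorted = np.allclose(
    neighbour_mean(sorted_ser).iloc[2:6], median_without_center(sorted_ser).iloc[2:6]
)
same_on_unsorted = np.allclose(
    neighbour_mean(unsorted_ser).iloc[2:6], median_without_center(unsorted_ser).iloc[2:6]
)
print(same_on_sorted, same_on_unsorted)  # True False
```

On sorted input the two middle values of the remaining four are always the immediate neighbours of the center, which is why the shortcut works there and not in general.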
