I have a pandas Series and want the cumulative (expanding) median of its non-zero values, with a result the same length as the original.
my_series.expanding().median()
gives me a series the same length as my_series, which is close to what I want. However, before the median of each window is taken, I want the zeros excluded from that window, whether by dropping those rows, selecting only the non-zero values, or something else... whatever performs best.
import pandas as pd
a = [0, 1, 2, 0, 100, 1000]
my_series = pd.Series(a)
my_series.expanding().median()
# returns:
0 0.0
1 0.5
2 1.0
3 0.5
4 1.0
5 1.5
dtype: float64
# desired output:
# the median is only computed on values in each window that are greater than zero
0 NaN
1 1.0
2 1.5
3 1.5
4 2.0
5 51.0
dtype: float64
Answer:
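You can replace the zeros with NaN before taking the expanding median. The window aggregation skips NaN, so the zeros never enter the median, and a window that contains no non-zero value yet returns NaN.
import numpy as np
my_series.replace(0, np.nan).expanding().median()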
Output:
0 NaN
1 1.0
2 1.5
3 1.5
4 2.0
5 51.0
dtype: float64
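If it helps, here is a small self-contained sketch of the same idea using a boolean mask instead of replace, which also makes the "non-zero" versus "greater than zero" choice explicit. The variable names (nonzero_median, positive_median, reference) are only illustrative, and the loop at the end is a slow brute-force check for comparison, not something to use on large data.

```python
import numpy as np
import pandas as pd

my_series = pd.Series([0, 1, 2, 0, 100, 1000])

# Variant 1: turn exact zeros into NaN (negative values, if any, are kept).
nonzero_median = my_series.mask(my_series == 0).expanding().median()

# Variant 2: keep only strictly positive values, everything else becomes NaN.
positive_median = my_series.where(my_series > 0).expanding().median()

# Brute-force reference: median of the non-zero values in each prefix,
# NaN while the prefix has no non-zero value.
values = my_series.to_numpy()
reference = pd.Series(
    [
        np.median(prefix[prefix != 0]) if (prefix != 0).any() else np.nan
        for prefix in (values[: i + 1] for i in range(len(values)))
    ]
)

print(nonzero_median)
print(np.allclose(nonzero_median, reference, equal_nan=True))
```

For this sample data both variants give the same result, since there are no negative values. They only differ when the series can contain negatives, so pick the mask that matches what you actually mean by "drop".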