Why doesn't the pandas interpolate method work with consecutive NaN values?

Let's say I have the following pandas Series:

s = pd.Series([np.nan, np.nan, np.nan, 0, 1, 2, 3])

and I want to use the pandas nearest interpolate method on this data.

When I run

s.interpolate(method='nearest')

the leading NaN values are not interpolated.

When I modify the series to, say, s = pd.Series([np.nan, 1, np.nan, 0, 1, 2, 3]), the same method works.

Do you know how to do the interpolation in the first case?

Thanks!

Answer:

A NaN can only be interpolated when it has a valid value on both sides. Filling the leading NaNs in your first series would be extrapolation, not interpolation.

As you can see with:

s = pd.Series([np.nan, 1, np.nan, 0, 1, 2, 3])
s.interpolate(method='nearest')

only the intermediate NaNs are interpolated:

0    NaN  # cannot interpolate
1    1.0
2    1.0  # interpolated
3    0.0
4    1.0
5    2.0
6    3.0
dtype: float64

Since you want the nearest value, a workaround is to chain bfill (for leading NaNs) or ffill (for trailing NaNs):

s.interpolate(method='nearest').bfill()

output:

0    1.0
1    1.0
2    1.0
3    0.0
4    1.0
5    2.0
6    3.0
dtype: float64
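
For the series from the question, a minimal sketch of the same idea is below. That series has no interior gaps, so only the edge fill matters. For a leading NaN, the nearest valid value is the first valid one, which is what bfill copies backward. The interpolate call is left out here only so the sketch runs without SciPy, which pandas needs for method='nearest'; the variable name filled is just for illustration.

```python
import numpy as np
import pandas as pd

s = pd.Series([np.nan, np.nan, np.nan, 0, 1, 2, 3])

# Leading NaNs have no valid value on their left, so the nearest valid value
# is the first valid entry; bfill propagates it backward over the gap.
filled = s.bfill()

print(filled.tolist())  # [0.0, 0.0, 0.0, 0.0, 1.0, 2.0, 3.0]
```

If the series also had interior NaNs, you would keep s.interpolate(method='nearest') in front of the bfill, as in the answer above.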

follow-up

The only remaining problem occurs with these two series:

1. s = pd.Series([np.nan, np.nan, np.nan, 0, np.nan, np.nan, np.nan])
2. s = pd.Series([np.nan, np.nan, np.nan, np.nan, np.nan, np.nan, np.nan])

In the first case, I want 0 everywhere. In the second case, I want to leave the series as it is.

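try:
    # nearest for interior gaps, then bfill/ffill for leading/trailing NaNs
    s2 = s.interpolate(method='nearest').bfill().ffill()
except ValueError:
    # fall back to plain fills when interpolate raises ValueError
    s2 = s.bfill().ffill()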
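
If you would rather not rely on exception handling, here is a rough sketch of a nearest-valid-value fill written directly in NumPy. Note that fill_nearest is a hypothetical helper, not a pandas API, and it does not call interpolate at all. It works on positions rather than index values, sends ties to the left neighbor, and builds an n-by-m distance matrix, so it is only meant for small series.

```python
import numpy as np
import pandas as pd


def fill_nearest(s: pd.Series) -> pd.Series:
    """Fill each NaN with the positionally nearest valid value (hypothetical helper)."""
    values = s.to_numpy()
    valid = np.flatnonzero(s.notna().to_numpy())  # positions of non-NaN entries
    if valid.size == 0:
        # Nothing to copy from: leave an all-NaN series unchanged.
        return s.copy()
    pos = np.arange(len(s))
    # For every position, pick the valid position with the smallest distance.
    # argmin returns the first minimum, so ties go to the left neighbor.
    nearest = valid[np.abs(pos[:, None] - valid[None, :]).argmin(axis=1)]
    return pd.Series(values[nearest], index=s.index)


one_value = pd.Series([np.nan, np.nan, np.nan, 0, np.nan, np.nan, np.nan])
all_nan = pd.Series([np.nan] * 7)

print(fill_nearest(one_value).tolist())  # [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
print(fill_nearest(all_nan).isna().all())  # True
```

Valid entries map to themselves (distance zero), so existing values are never overwritten.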