Why doesn't the pandas interpolate method work with consecutive leading NaN values?

Time:09-28

Let's say I have the following pandas Series:

s = pd.Series([np.nan, np.nan, np.nan, 0, 1, 2, 3])

and I want to use the pandas nearest interpolation method on this data.

When I run

s.interpolate(method='nearest')

the leading NaN values are left unchanged.

If I change the series to, say, s = pd.Series([np.nan, 1, np.nan, 0, 1, 2, 3]), the same call does interpolate.

Do you know how to do the interpolation in the first case?

Thanks!

CodePudding user response:

You need a valid value on each side of a NaN to interpolate it; otherwise this would be extrapolation.

As you can see with:

s = pd.Series([np.nan, 1, np.nan, 0, 1, 2, 3])
s.interpolate(method='nearest')

only the intermediate NaNs are interpolated:

0    NaN  # cannot interpolate
1    1.0
2    1.0  # interpolated
3    0.0
4    1.0
5    2.0
6    3.0
dtype: float64
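
To see which NaNs count as intermediate, one can compare each position with the first and last valid labels. This is only a minimal sketch using plain pandas; the names inside and outside are illustrative, and it assumes a sortable index and at least one non-NaN value (first_valid_index returns None otherwise).

```python
import numpy as np
import pandas as pd

s = pd.Series([np.nan, 1, np.nan, 0, 1, 2, 3])

# Labels of the first and last non-NaN entries (None if the series is all NaN)
first, last = s.first_valid_index(), s.last_valid_index()

# A NaN is intermediate when valid values exist on both sides of it
inside = s.isna() & (s.index > first) & (s.index < last)
# Leading or trailing NaNs would require extrapolation
outside = s.isna() & ~inside

print(s[inside].index.tolist())   # [2] -> can be interpolated
print(s[outside].index.tolist())  # [0] -> stays NaN
```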

Since you want the nearest value, a workaround is to bfill the leading NaNs (or ffill trailing ones):

s.interpolate(method='nearest').bfill()

output:

0    1.0
1    1.0
2    1.0
3    0.0
4    1.0
5    2.0
6    3.0
dtype: float64
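
The difference between the two fills is the direction: bfill copies the next valid value backwards (so it covers leading NaNs), while ffill copies the previous valid value forwards (so it covers trailing NaNs). A small sketch on a made-up series t, which is not part of the question:

```python
import numpy as np
import pandas as pd

# Illustrative series with one leading and one trailing NaN
t = pd.Series([np.nan, 1.0, 2.0, np.nan])

print(t.bfill().tolist())          # [1.0, 1.0, 2.0, nan]
print(t.ffill().tolist())          # [nan, 1.0, 2.0, 2.0]
print(t.bfill().ffill().tolist())  # [1.0, 1.0, 2.0, 2.0]
```

Chaining both, as in the last line, covers NaNs at either end.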

follow-up

The only remaining problem occurs with 1. s = pd.Series([np.nan, np.nan, np.nan, 0, np.nan, np.nan, np.nan]) and 2. s = pd.Series([np.nan, np.nan, np.nan, np.nan, np.nan, np.nan, np.nan]). In the first case, I want 0 everywhere. In the second case, I want to leave the series as it is.

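# Interpolate where possible, then fill the leading and trailing NaNs
try:
    s2 = s.interpolate(method='nearest').bfill().ffill()
# Fall back to plain fills if the interpolation raises ValueError
except ValueError:
    s2 = s.bfill().ffill()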
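
The same fallback can be sketched without relying on an exception, by counting the valid values first. This is a variation on the code above, not the original answer's approach: fill_nearest is a hypothetical helper name, and the threshold of two valid values is an assumption taken from the "two surrounding values" rule stated earlier.

```python
import numpy as np
import pandas as pd

def fill_nearest(s: pd.Series) -> pd.Series:
    """Hypothetical helper: nearest interpolation plus edge filling."""
    # With fewer than two valid values there is nothing to interpolate between
    if s.notna().sum() < 2:
        return s.bfill().ffill()
    # method='nearest' is delegated to SciPy, which must be installed
    return s.interpolate(method='nearest').bfill().ffill()

one_value = pd.Series([np.nan, np.nan, np.nan, 0, np.nan, np.nan, np.nan])
all_nan = pd.Series([np.nan] * 7)

print(fill_nearest(one_value).tolist())    # [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
print(fill_nearest(all_nan).isna().all())  # True
```

Checking the count up front keeps the behaviour independent of whether the interpolation call happens to raise for a given input.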