Let's say I have the following pandas Series:
s = pd.Series([np.nan, np.nan, np.nan, 0, 1, 2, 3])
and I want to use the pandas nearest interpolate method on this data.
When I run the code
s.interpolate(method='nearest')
it does not do the interpolation.
When I modify the series to, say,
s = pd.Series([np.nan, 1, np.nan, 0, 1, 2, 3])
the same method works.
Do you know how to do the interpolation in the first case?
Thanks!
CodePudding user response:
You need two surrounding values to be able to interpolate, else this would be extrapolation.
As you can see with:
s = pd.Series([np.nan, 1, np.nan, 0, 1, 2, 3])
s.interpolate(method='nearest')
only the intermediate NaNs are interpolated:
0 NaN # cannot interpolate
1 1.0
2 1.0 # interpolated
3 0.0
4 1.0
5 2.0
6 3.0
dtype: float64
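To see which NaNs count as "intermediate", here is a minimal sketch that separates interior NaNs from edge NaNs. It assumes the default integer index and at least one valid value; the names interior_nans and edge_nans are only illustrative.

```python
import numpy as np
import pandas as pd

s = pd.Series([np.nan, 1, np.nan, 0, 1, 2, 3])

# Labels of the first and last non-NaN values
first, last = s.first_valid_index(), s.last_valid_index()

# True for positions lying between the first and last valid values
inside = (s.index >= first) & (s.index <= last)

interior_nans = s.isna() & inside    # have a valid value on both sides
edge_nans = s.isna() & ~inside       # filling these would be extrapolation

print(interior_nans.tolist())
print(edge_nans.tolist())
```

Only the positions flagged in interior_nans have a valid value on both sides; the ones in edge_nans need a separate fill step.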
As you want the nearest value, a workaround could be to bfill (or ffill):
s.interpolate(method='nearest').bfill()
output:
0 1.0
1 1.0
2 1.0
3 0.0
4 1.0
5 2.0
6 3.0
dtype: float64
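For the series from the original question there are only leading NaNs, so the nearest valid value for each of them is simply the next one. A minimal sketch, assuming that is the only pattern present, in which case bfill on its own is enough:

```python
import numpy as np
import pandas as pd

s = pd.Series([np.nan, np.nan, np.nan, 0, 1, 2, 3])

# Leading NaNs take the next valid value; existing values are untouched
filled = s.bfill()
print(filled.tolist())
```

If the series can also end with NaNs, chaining .ffill() after .bfill() covers the trailing edge as well.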
follow-up
The only problems occur with these two series:
1. s = pd.Series([np.nan, np.nan, np.nan, 0, np.nan, np.nan, np.nan])
2. s = pd.Series([np.nan, np.nan, np.nan, np.nan, np.nan, np.nan, np.nan])
In the first case, I want to have 0 everywhere. In the second case, I want to leave the series as it is.
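try:
    # nearest interpolation for interior NaNs, then fill both edges
    s2 = s.interpolate(method='nearest').bfill().ffill()
except ValueError:
    # fallback when interpolation fails (e.g. a single valid value)
    s2 = s.bfill().ffill()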
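If you would rather avoid the try/except, an alternative is to compute the nearest valid position directly with NumPy. This is only a sketch, not what interpolate does internally: fill_nearest is a hypothetical helper, "nearest" is measured by position rather than by index value, and ties go to the left neighbor.

```python
import numpy as np
import pandas as pd

def fill_nearest(s: pd.Series) -> pd.Series:
    """Fill each NaN with the positionally nearest valid value (hypothetical helper)."""
    valid = np.flatnonzero(s.notna().to_numpy())  # positions of non-NaN values
    if valid.size == 0:
        return s.copy()  # nothing to copy from: leave the series unchanged
    pos = np.arange(len(s))
    # Index into `valid` of the first valid position at or after each position
    right = np.searchsorted(valid, pos).clip(max=valid.size - 1)
    left = (right - 1).clip(min=0)
    # Pick the closer neighbor; ties go to the left one (an assumption)
    use_left = np.abs(pos - valid[left]) <= np.abs(valid[right] - pos)
    nearest = np.where(use_left, valid[left], valid[right])
    return pd.Series(s.to_numpy()[nearest], index=s.index, name=s.name)

single = pd.Series([np.nan, np.nan, np.nan, 0, np.nan, np.nan, np.nan])
all_nan = pd.Series([np.nan] * 7)
mixed = pd.Series([np.nan, 1, np.nan, 0, 1, 2, 3])

print(fill_nearest(single).tolist())
print(fill_nearest(all_nan).tolist())
print(fill_nearest(mixed).tolist())
```

With a single valid value every position maps to it, and an all-NaN series is returned unchanged, which matches the two cases from the follow-up.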