Consider the following list:
import numpy as np
import pandas as pd
l = [1, 4, 6, np.nan, 20, np.nan, 24]
I know I can replace the nan values by simple linear interpolation using pandas' interpolate(), as follows:
pd.Series([1, 4, 6, np.nan, 20, np.nan, 24]).interpolate()
Out[38]:
0     1.0
1     4.0
2     6.0
3    13.0
4    20.0
5    22.0
6    24.0
dtype: float64
My question is: how can I get the same result using only list comprehensions and standard numpy functions, but no built-in interpolation function (pd.Series.interpolate() or numpy.interp())? That is, by directly applying the formula for linear interpolation between two points.
CodePudding user response:
l = [1,4,6,np.nan,20,np.nan,24]
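# average the two neighbours; only valid for isolated nans that are not at either end of the list
res = [l[i] if not np.isnan(l[i]) else (l[i-1] + l[i+1]) / 2 for i in range(len(l))]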
print(res)
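Averaging the two neighbours is the midpoint case of the two-point formula y = y0 + (y1 - y0) * (x - x0) / (x1 - x0). As a minimal sketch, the same comprehension can call that formula explicitly. The helper name lerp is hypothetical, and the code keeps the same assumption that every nan is isolated and has a valid neighbour on both sides:

```python
import numpy as np

def lerp(x, x0, y0, x1, y1):
    """Linear interpolation between (x0, y0) and (x1, y1), evaluated at x."""
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)

l = [1, 4, 6, np.nan, 20, np.nan, 24]

# Assumption: each nan is isolated and not at either end of the list,
# so the list indices i-1 and i+1 serve as the two known x positions.
res = [
    l[i] if not np.isnan(l[i]) else lerp(i, i - 1, l[i - 1], i + 1, l[i + 1])
    for i in range(len(l))
]
print(res)  # [1, 4, 6, 13.0, 20, 22.0, 24]
```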
CodePudding user response:
I'm not sure this really fits the question, since it is not just a list comprehension, but I've worked out a solution that also handles gaps of more than one consecutive nan:
import numpy as np

l = [1, 4, 6, np.nan, 20, np.nan, 24, 30, 31, np.nan, np.nan, 70, 75]
# True -> entry is nan
nans = np.isnan(l)
# 1 -> from number to nan, -1 -> from nan to number
diffs = np.diff(nans.astype(int))
# gap_starts: index of the last number before each gap
# gap_ends: index of the last nan in each gap
gap_starts = np.where(diffs == 1)[0]
gap_ends = np.where(diffs == -1)[0]
for begin, end in zip(gap_starts, gap_ends):
    # number of nans in the gap
    nans_n = end - begin
    # difference of gap extrema
    nan_diff = abs(l[begin] - l[end + 1])
    # step to add at each nan
    step = round(nan_diff / (nans_n + 1))
    # interpolate section from begin to end
    filling = [l[begin] + (step * n) for n in range(1, nans_n + 1)]
    # fix l with interpolated values
    l[begin + 1:end + 1] = filling
print(l)
This produces:
[1, 4, 6, 13, 20, 22, 24, 30, 31, 44, 57, 70, 75]
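Note that the loop above takes abs() of the difference and rounds the step, so it gives exact results only when values increase across a gap and the step is an integer. It also assumes the list neither starts nor ends with nan. A minimal sketch that drops the first two assumptions, still without np.interp, might look like this. The function name fill_gaps is hypothetical, and leaving leading and trailing nans untouched is a choice made here, not something the question specifies:

```python
import numpy as np

def fill_gaps(values):
    """Linearly interpolate interior runs of nan; edge nans are left as-is."""
    a = np.asarray(values, dtype=float)
    valid = np.flatnonzero(~np.isnan(a))  # indices that hold real numbers
    out = a.copy()
    # Each pair of consecutive valid indices brackets at most one nan run.
    for left, right in zip(valid[:-1], valid[1:]):
        if right - left > 1:
            # signed, unrounded slope between the two known points
            slope = (a[right] - a[left]) / (right - left)
            idx = np.arange(left + 1, right)
            out[idx] = a[left] + slope * (idx - left)
    return out.tolist()

print(fill_gaps([10, np.nan, np.nan, 4, np.nan, 5]))
# [10.0, 8.0, 6.0, 4.0, 4.5, 5.0]
```

The result is always a list of floats, which matches the float64 output of the pandas version in the question.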