I have two lists of y values:
y_list1 = [45,np.nan,np.nan,np.nan, 40,50,6,2,7,np.nan, np.nan,np.nan, np.nan, np.nan]
y_list2 = [4,23,np.nan, np.nan, np.nan, np.nan, np.nan,5, np.nan, np.nan, np.nan, np.nan, np.nan, np.nan]
and both sets of values were obtained at the same set of time points:
x = np.array([0,3,4,5,6,7,8,9,10,11,12,13,14,15])
The aim: Return y_list1 and y_list2 with the np.nans replaced with values, by fitting a polynomial regression to the data that is there, and then calculating the missing points.
I am able to fit the polynomial:
import sys
import numpy as np
x = np.array([0,3,4,5,6,7,8,9,10,11,12,13,14,15])
id_list = ['1','2']
list_y = np.array([[45,np.nan,np.nan,np.nan, 40,50,6,2,7,np.nan, np.nan,np.nan, np.nan, np.nan],[4,23,np.nan, np.nan, np.nan, np.nan, np.nan,5, np.nan, np.nan, np.nan, np.nan, np.nan, np.nan]])
for each_id,y in zip(id_list,list_y):
    #treat the missing data
    idx = np.isfinite(x) & np.isfinite(y)
    #fit
    ab = np.polyfit(x[idx], y[idx], len(list_y[0]))
So then I wanted to use this fit to replace the missing values in y, so I found this, and implemented:
    replace_nan = np.polyval(x,y)
    print(replace_nan)
The output is:
[2.13161598e+20 nan nan nan
 5.20634185e+19 7.52453405e+20 8.35884417e+09 3.27510000e+04
 5.11358666e+10 nan nan nan
 nan nan]
test_polyreg.py:16: RankWarning: Polyfit may be poorly conditioned
ab = np.polyfit(x[idx], y[idx], len(list_y[0])) #understand how many degrees
[7.45653990e+07 6.97736286e+16 nan nan
 nan nan nan 9.91821285e+08
 nan nan nan nan
 nan nan]
I'm not concerned about the poor-conditioning warning, because this is just test data to help me understand how this should work. However, the output still contains nans, and it doesn't use the fit I generated earlier. Could someone show me how to replace the nans in the y values with points estimated from a polynomial regression?
CodePudding user response:
First, you should modify the ab definition as:
ab = np.polyfit(x[idx], np.array(y)[idx], idx.sum())
ab holds your polynomial coefficients, so you have to pass them to np.polyval as the first argument, followed by the x values at which to evaluate:
replace_nan = np.polyval(ab,x)
print(replace_nan)
out:
[ 4. 23. 26.54413638 28.01419869 27.00250156
23.10135965 15.90308758 5. -10.01558845 -29.55136312
-54.01500938 -83.81421259 -119.3566581 -161.05003127]
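Note that the output above replaces every entry with the fitted value, not only the nans. If the goal is to keep the measured points and fill only the gaps, a minimal sketch could look like the following. The helper name fill_nans_with_polyfit and the max_degree parameter are hypothetical, and capping the degree at one less than the number of known points is an assumption made here to keep the fit well determined; it is not part of the code above.

```python
import numpy as np

def fill_nans_with_polyfit(x, y, max_degree=2):
    """Return a copy of y with NaNs replaced by values from a polynomial fit.

    Hypothetical helper for illustration; max_degree is an assumed knob.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    known = np.isfinite(x) & np.isfinite(y)
    n_known = int(known.sum())
    if n_known == 0:
        raise ValueError("need at least one finite (x, y) pair to fit")
    # A degree-d polynomial has d + 1 coefficients, so cap the degree
    # at n_known - 1 to avoid an underdetermined fit.
    degree = min(max_degree, n_known - 1)
    coeffs = np.polyfit(x[known], y[known], degree)
    # np.polyval takes the coefficients first, then the evaluation points.
    fitted = np.polyval(coeffs, x)
    # Keep the measured values; use the fit only where y was missing.
    return np.where(known, y, fitted)

x = np.array([0, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15])
y = np.array([4, 23] + [np.nan] * 5 + [5] + [np.nan] * 6)
filled = fill_nans_with_polyfit(x, y)
print(filled)
```

With only three known points, extrapolated values far from the data (for example at x = 15) should be treated with caution, whatever degree is chosen.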