I've got some astrophysics data with missing values, which I'm trying to fill using the pandas interpolate method. It works on the np.NaN values everywhere except in the first three rows. There it does fill the values, but instead of a linear fill it just copies in the values from the fourth row. The first chunk of the DataFrame (called avgdf_final) looks like this:
        day     lon     lat    rad
nums
0     319.0     NaN     NaN    NaN
1     320.0     NaN     NaN    NaN
2     321.0     NaN     NaN    NaN
3     322.0  56.485  2.7800  1.158
4     323.0  43.300  2.6800  1.166
5     324.0  30.100  2.5775  1.174
I've tried this (with like a million different minor variations) to no avail:
avgdf_final.interpolate(limit_direction='backward')
Every time, I get this result:
        day     lon     lat    rad
nums
0     319.0  56.485  2.7800  1.158
1     320.0  56.485  2.7800  1.158
2     321.0  56.485  2.7800  1.158
3     322.0  56.485  2.7800  1.158
4     323.0  43.300  2.6800  1.166
5     324.0  30.100  2.5775  1.174
Clearly, this isn't interpolated: it's just the same rows pasted in again. What can I do to make this work? Thanks in advance for any replies!!
Answer:
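Linear interpolation only fills values that lie between known data points, which is not the case here: the NaNs come before the first valid row, so what you actually want is extrapolation. That is why the leading rows simply receive a copy of the nearest valid value.
You could try a spline instead (method='spline' requires SciPy; df here is your avgdf_final):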
df2 = df.interpolate(method='spline', limit_direction='backward', order=1)
See the pandas DataFrame.interpolate documentation for other methods.
Output:
     day      lon       lat    rad
0  319.0  96.0650  3.084167  1.134
1  320.0  82.8725  2.982917  1.142
2  321.0  69.6800  2.881667  1.150
3  322.0  56.4850  2.780000  1.158
4  323.0  43.3000  2.680000  1.166
5  324.0  30.1000  2.577500  1.174
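If you would rather not depend on SciPy, here is a rough sketch of an alternative: fit a straight line through the first few valid points of each column with np.polyfit and evaluate it at the leading NaN rows. The helper name fill_leading_nans and its n_fit parameter are made up for this illustration (they are not part of pandas), and the sketch assumes the rows are evenly spaced so that the row position can serve as the x coordinate.

```python
import numpy as np
import pandas as pd


def fill_leading_nans(col, n_fit=3):
    """Hypothetical helper: fill NaNs before the first valid value of `col`
    by extrapolating a least-squares line through the first `n_fit` valid
    points (n_fit should be at least 2)."""
    y = col.to_numpy(dtype=float)
    x = np.arange(len(y), dtype=float)  # assumes evenly spaced rows
    valid = np.flatnonzero(~np.isnan(y))
    if len(valid) < 2:
        return col.copy()  # not enough points to define a line
    fit_idx = valid[:n_fit]
    slope, intercept = np.polyfit(x[fit_idx], y[fit_idx], deg=1)
    filled = y.copy()
    leading = x < valid[0]  # only the rows before the first valid value
    filled[leading] = slope * x[leading] + intercept
    return pd.Series(filled, index=col.index, name=col.name)


# Sample rows copied from the question
avgdf_final = pd.DataFrame(
    {
        "day": [319.0, 320.0, 321.0, 322.0, 323.0, 324.0],
        "lon": [np.nan, np.nan, np.nan, 56.485, 43.300, 30.100],
        "lat": [np.nan, np.nan, np.nan, 2.7800, 2.6800, 2.5775],
        "rad": [np.nan, np.nan, np.nan, 1.158, 1.166, 1.174],
    }
)
avgdf_final.index.name = "nums"

# Interior gaps get ordinary linear interpolation; leading gaps get the line fit
result = avgdf_final.interpolate(limit_area="inside").apply(fill_leading_nans)
print(result)
```

On these six sample rows with n_fit=3, the leading rows come out the same as in the spline output above (for example, lon is about 69.68 at row 2). On your full data the two approaches may differ, since this sketch only looks at the first n_fit valid points. A smaller n_fit follows the local slope more closely; a larger one averages out noise.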