How do I use Pandas to interpolate on the first few rows of a dataframe?

Time:02-22

I've got some astrophysics data with unfilled rows, which I'm trying to fill with the pandas interpolate method. It works on the NaN values everywhere except in the first three rows. Those rows do get filled, but instead of a linear fill, they just get the values of the fourth row. The first chunk of the dataframe (called avgdf_final) looks like this:

        day     lon     lat    rad
nums                              
0     319.0     NaN     NaN    NaN
1     320.0     NaN     NaN    NaN
2     321.0     NaN     NaN    NaN
3     322.0  56.485  2.7800  1.158
4     323.0  43.300  2.6800  1.166
5     324.0  30.100  2.5775  1.174

I've tried this (with like a million different minor variations) to no avail:

avgdf_final.interpolate(limit_direction='backward')

Every time, I get this result:

        day     lon     lat    rad
nums                              
0     319.0  56.485  2.7800  1.158
1     320.0  56.485  2.7800  1.158
2     321.0  56.485  2.7800  1.158
3     322.0  56.485  2.7800  1.158
4     323.0  43.300  2.6800  1.166
5     324.0  30.100  2.5775  1.174

Clearly, this isn't interpolated: it's just the same rows pasted in again. What can I do to make this work? Thanks in advance for any replies!!

CodePudding user response:

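Interpolation with method "linear" only works between known data points, which is not the case here: the NaNs come before the first valid row, so what you actually want is extrapolation. With limit_direction='backward', pandas fills those leading rows by repeating the first valid value rather than extending the trend.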
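To see why the leading rows just repeat the first valid value, here is a minimal reproduction built from the rows shown in the question. The frame is typed in by hand, so treat it as a sketch rather than the real data:

```python
import numpy as np
import pandas as pd

# Hand-built copy of the six rows shown in the question.
avgdf_final = pd.DataFrame(
    {
        "day": [319.0, 320.0, 321.0, 322.0, 323.0, 324.0],
        "lon": [np.nan, np.nan, np.nan, 56.485, 43.300, 30.100],
        "lat": [np.nan, np.nan, np.nan, 2.7800, 2.6800, 2.5775],
        "rad": [np.nan, np.nan, np.nan, 1.158, 1.166, 1.174],
    },
    index=pd.Index(range(6), name="nums"),
)

# The default method is "linear". There is no valid point before row 3,
# so the backward fill has nothing to interpolate between.
filled = avgdf_final.interpolate(limit_direction="backward")
print(filled)
```

Rows 0 to 2 come back as copies of row 3, which matches the result in the question.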

You could try a first-order spline instead (this needs SciPy installed; df here stands for your avgdf_final):

df2 = df.interpolate(method='spline', limit_direction='backward', order=1)

See the interpolate documentation for other methods.

Output:

     day      lon       lat    rad
0  319.0  96.0650  3.084167  1.134
1  320.0  82.8725  2.982917  1.142
2  321.0  69.6800  2.881667  1.150
3  322.0  56.4850  2.780000  1.158
4  323.0  43.3000  2.680000  1.166
5  324.0  30.1000  2.577500  1.174
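
If you would rather not depend on SciPy, a straight-line fit can fill the leading rows as well. This is only a sketch, not a pandas feature: extrapolate_leading is a hypothetical helper, and fitting on the first three valid rows (n_fit=3) with day as the x axis are assumptions you may want to change.

```python
import numpy as np
import pandas as pd

def extrapolate_leading(df, x_col="day", n_fit=3):
    """Fill leading NaNs in each column with a straight line fitted to the
    first n_fit valid rows (hypothetical helper, not part of pandas)."""
    out = df.copy()
    for col in out.columns.drop(x_col):
        first = out[col].first_valid_index()
        if first is None:
            continue  # column is entirely NaN, nothing to fit
        pos = out.index.get_loc(first)
        if pos == 0:
            continue  # no leading NaNs in this column
        # Least-squares line through the first n_fit rows after the gap.
        fit = out.iloc[pos:pos + n_fit].dropna(subset=[col])
        slope, intercept = np.polyfit(fit[x_col], fit[col], deg=1)
        lead = out.index[:pos]
        out.loc[lead, col] = slope * out.loc[lead, x_col] + intercept
    return out

# Sample frame typed in from the question.
df = pd.DataFrame({
    "day": [319.0, 320.0, 321.0, 322.0, 323.0, 324.0],
    "lon": [np.nan, np.nan, np.nan, 56.485, 43.300, 30.100],
    "lat": [np.nan, np.nan, np.nan, 2.7800, 2.6800, 2.5775],
    "rad": [np.nan, np.nan, np.nan, 1.158, 1.166, 1.174],
})
result = extrapolate_leading(df)
print(result)
```

On this six-row sample the filled rows agree with the spline output above (lon starts 96.065, 82.8725, 69.68), while rows 3 to 5 are left exactly as they were. On the full dataframe the numbers will depend on how many rows you fit on.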