Home > Software engineering >  Interpolating column of first dataframe based on second dataframe
Interpolating column of first dataframe based on second dataframe

Time:01-16

Suppose that I have 2 dataframes df1:

col1
0
10
20
30

and df2:

col1     val
0        0
5        2
15       4
25       5
33       8

I want to compute val column for df1 that is linearly interpolated from val in df2 based on col1. Also note that col1 in both dfs are sorted in strictly increasing order.

So df1.val should look as follows:

0 (no interpolation needed since there's an exact match on col1)
3 (interpolate between df2.iloc[1] and df2.iloc[2])
4.5 (interpolate between df2.iloc[2] and df2.iloc[3])
5   5 / 8 * 3 = 6.875 (interpolate between df2.iloc[3] and df2.iloc[4])

One idea I have is to use merge_asof twice and then interpolate.

df1["val_left"] = pd.merge_asof(df1, df2[['col1', 'val']],
                                on='col1',
                                direction='backward')["val"].values
df1["val_right"] = pd.merge_asof(df1, df2[['col1', 'val']],
                                on='col1',
                                direction='forward')["val"].values
# then do the interpolation in df1, but I don't have the df2.col1 info (is there an easy way to get that) to do the interpolation? 

Are there (better) alternative approaches besides the above?

CodePudding user response:

How about using numpy.interp to create a 1D linear interpolation?

>>> import numpy as np
>>> np.interp(x=df1['col1'], xp=df2['col1'], fp=df2['val'])
array([0.   , 3.   , 4.5  , 6.875])

The function fits on xp & fp, and predicts on x.

  • Related