Suppose that I have 2 dataframes, df1:
col1
   0
  10
  20
  30
and df2:
col1  val
   0    0
   5    2
  15    4
  25    5
  33    8
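For a reproducible setup, the two frames can be built like this:

import pandas as pd

df1 = pd.DataFrame({'col1': [0, 10, 20, 30]})
df2 = pd.DataFrame({'col1': [0, 5, 15, 25, 33], 'val': [0, 2, 4, 5, 8]})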
I want to compute a val column for df1 that is linearly interpolated from val in df2 based on col1. Note that col1 in both dataframes is sorted in strictly increasing order.
So df1.val should look as follows:
0 (no interpolation needed since there's an exact match on col1)
3 (interpolate between df2.iloc[1] and df2.iloc[2])
4.5 (interpolate between df2.iloc[2] and df2.iloc[3])
6.875 (interpolate between df2.iloc[3] and df2.iloc[4]: 5 + (30 - 25)/(33 - 25) * (8 - 5) = 5 + 5/8 * 3 = 6.875)
One idea I have is to use merge_asof twice and then interpolate:
df1["val_left"] = pd.merge_asof(df1, df2[['col1', 'val']],
on='col1',
direction='backward')["val"].values
df1["val_right"] = pd.merge_asof(df1, df2[['col1', 'val']],
on='col1',
direction='forward')["val"].values
# then do the interpolation in df1, but I don't have the df2.col1 info (is there an easy way to get that) to do the interpolation?
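One workaround I can think of (a sketch; col1_src is just a made-up helper name): copy df2.col1 into an extra column so it survives the merges, then interpolate manually:

import numpy as np

df2_aug = df2.assign(col1_src=df2['col1'])  # keep df2's key under a second name
left = pd.merge_asof(df1[['col1']], df2_aug, on='col1', direction='backward')
right = pd.merge_asof(df1[['col1']], df2_aug, on='col1', direction='forward')

x = df1['col1'].to_numpy()
x0, y0 = left['col1_src'].to_numpy(), left['val'].to_numpy()
x1, y1 = right['col1_src'].to_numpy(), right['val'].to_numpy()

# guard against 0/0 where col1 matches df2 exactly (x0 == x1)
span = np.where(x1 == x0, 1, x1 - x0)
df1['val'] = y0 + np.where(x1 == x0, 0.0, (x - x0) / span) * (y1 - y0)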
Are there (better) alternative approaches besides the above?
CodePudding user response:
How about using numpy.interp to do the 1-D linear interpolation?
>>> import numpy as np
>>> np.interp(x=df1['col1'], xp=df2['col1'], fp=df2['val'])
array([0. , 3. , 4.5 , 6.875])
The function builds a piecewise-linear interpolant through the points (xp, fp) and evaluates it at x.
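To store the result, assign it back to df1. Two caveats worth noting: np.interp requires xp to be increasing (which the question guarantees here), and by default it clips points outside the range of xp to the endpoint values instead of extrapolating:

>>> df1['val'] = np.interp(x=df1['col1'], xp=df2['col1'], fp=df2['val'])
>>> np.interp(x=[-10, 40], xp=df2['col1'], fp=df2['val'])  # out-of-range points clip
array([0., 8.])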