Home > Blockchain >  Merge two pandas dataframes that have slightly different values on the column which is being merged
Merge two pandas dataframes that have slightly different values on the column which is being merged

Time:09-04

How can I merge two data frames when the column has a slight offset than the column I am merging to?

df1 =

col1 col2
1 a
2 b
3 c

df2 =

col1 col3
1.01 d
2 e
2.95 f

so, the merged column would end up like this even though the values in col1 are slightly different.

df_merge =

col1 col2 col3
1 a d
2 b e
3 c f

I have seen scenarios like this where "col1" is a string, but I'm wondering if it's possible to do this with something like pandas.merge() in the scenario where there is slight numerical offset (e.g /- 0.05).

CodePudding user response:

Lets do merge_asof with tolerance parameter

pd.merge_asof(
    df1.astype({'col1': 'float'}).sort_values('col1'), 
    df2.sort_values('col1'), 
    on='col1', 
    direction='nearest', 
    tolerance=.05
)

   col1 col2 col3
0   1.0    a    d
1   2.0    b    e
2   3.0    c    f

PS: if the dataframes are already sorted on col1 then there is no need to sort again.

  • Related