How can I merge two data frames when the column has a slight offset than the column I am merging to?
df1 =
col1 | col2 |
---|---|
1 | a |
2 | b |
3 | c |
df2 =
col1 | col3 |
---|---|
1.01 | d |
2 | e |
2.95 | f |
so, the merged column would end up like this even though the values in col1 are slightly different.
df_merge =
col1 | col2 | col3 |
---|---|---|
1 | a | d |
2 | b | e |
3 | c | f |
I have seen scenarios like this where "col1" is a string, but I'm wondering if it's possible to do this with something like pandas.merge()
in the scenario where there is slight numerical offset (e.g /- 0.05).
CodePudding user response:
Lets do merge_asof
with tolerance
parameter
pd.merge_asof(
df1.astype({'col1': 'float'}).sort_values('col1'),
df2.sort_values('col1'),
on='col1',
direction='nearest',
tolerance=.05
)
col1 col2 col3
0 1.0 a d
1 2.0 b e
2 3.0 c f
PS: if the dataframes are already sorted on col1
then there is no need to sort again.