I have two dataframes colored by approximately matching marks:
The "marks" are not the same in each of them, but some are close. How can I copy the "Evaluated" value from df2 to df1 based on relevant "name" and "mark"?
My code is:
df1 = pd.DataFrame({'Name': ['Lisa', 'Lisa', 'Lisa', 'Hann', 'Hann', 'Hann'],
'Marks': [25.123, 26.425, 27.456, 25.789, 26.124, 26.225],
'Evaluated':['','','','','','']})
df2 = pd.DataFrame({'Name':['Lisa', 'Lisa', 'Lisa', 'Lisa', 'Hann', 'Hann', 'Hann', 'Hann'],
'Marks':[25.125, 26.422, 27.451, 27.465, 25.786, 25.796, 26.121, 26.227],
'Evaluated':[0, 0, 1, 1, 1, 1, 1, 1]})
df3 = pd.merge(df1.round(2),
df2.round(2),
how='left',
on=['Name', 'Marks'])
Expected result is df3
How can I do an approximate match and get the value of the last column? I tried to use df.loc and df.where but they didn't work because tables are in different shapes. What I expect is similar function to Excel's Vlookup function where approximation is True. My code changes the values at the end, which I would love to keep as it was in df1. Probably I could make a copy from what I had before, but I believe there is a more pythonic way to solve it, rather than merging the tables.
Thanks in advance!
CodePudding user response:
You can try pandas.merge_asof
df1 = df1.sort_values(['Marks'])
df2 = df2.sort_values(['Marks'])
df3 = pd.merge_asof(df1[['Name', 'Marks']],
df2,
on='Marks',
direction='nearest',
by='Name')
print(df3)
Name Marks Evaluated
0 Lisa 25.123 0
1 Hann 25.789 1
2 Hann 26.124 1
3 Hann 26.225 1
4 Lisa 26.425 0
5 Lisa 27.456 1