I am trying to merge information from two dataframes with different sizes. See a short cut from the dataframe in the code below.
import pandas as pd
df1 = pd.DataFrame({'ID': [0,1,2,3,4,5,6],
'distance': [55.1, 55.2, 55.3, 55.4, 55.6, 55.7, 55.8 ],
'spec_location': ['','','','','','','',]})
df2 = pd.DataFrame({'ID': [0,1,2],
'start_distance': [55, 55.4, 55.8],
'end_distance': [55.4,55.8,56],
'location': ['PKS', 'DDS', 'LMS']})
Dataframe 1 is a more detailed dataframe and I need to transfer information from dataframe 2 to dataframe 1 based on a condition. The condition states that if distance (from df1) is between the start and end distance (from df2). The location specified in df2 should be filled in at the spec_location in df1. This would then result in a dataframe that looks likes this:
ID distance spec_location
0 0 55.1 PKS
1 1 55.2 PKS
2 2 55.3 PKS
3 3 55.4 PKS
4 4 55.6 DDS
5 5 55.7 DDS
6 6 55.8 DDS
I have tried various loops and merge techniques but I can't seem to get it to work.
CodePudding user response:
Write a mapper function and apply it:
def location(x):
for _, row in df2.iterrows():
if row['start_distance'] < x <= row['end_distance']:
return row['location']
df1['spec_location'] = df1['distance'].apply(location)
Output:
ID distance spec_location
0 0 55.1 PKS
1 1 55.2 PKS
2 2 55.3 PKS
3 3 55.4 PKS
4 4 55.6 DDS
5 5 55.7 DDS
6 6 55.8 DDS