I have two data frames: the first has the longitude and latitude of the meters, and the second has the longitude and latitude of the stations. I am trying to link each meter to its closest station. Here is my example:
import pandas as pd

df_id = pd.DataFrame()
df_id['id'] = [1, 2]
df_id['lat'] = [32, 55]
df_id['long'] = [-89, -8]
Here is the station dataframe:
df_station = pd.DataFrame()
df_station['id'] = [10, 20]
df_station['lat'] = [33, 56]
df_station['long'] = [-88.23, -7]
and here is the output:
CodePudding user response:
Pandas' cross merge can pair every id with every station:
# cross merge the 2 dfs to pair all ids with stations (how='cross' needs pandas 1.2 or later)
df_merged = df_id.merge(df_station.add_suffix('_station'), how='cross')
# find euclidean distance (in degrees) between all pairs of locations
df_merged['distance'] = ((df_merged.lat - df_merged.lat_station)**2 + (df_merged.long - df_merged.long_station)**2).pow(0.5)
# filter the closest station for each id
df_merged.loc[df_merged.groupby('id')['distance'].idxmin(), ['id', 'id_station']]
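Euclidean distance on raw latitude/longitude treats a degree of longitude as the same length everywhere, which can pick the wrong station far from the equator. Below is a minimal sketch of the same cross-merge approach using great-circle (haversine) distance instead. It is a swapped-in technique, not part of the answer above: the helper name `haversine_km`, the `distance_km` column, and the mean Earth radius of 6371 km are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Example data from the question
df_id = pd.DataFrame({'id': [1, 2], 'lat': [32, 55], 'long': [-89, -8]})
df_station = pd.DataFrame({'id': [10, 20], 'lat': [33, 56], 'long': [-88.23, -7]})


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km (hypothetical helper, assumes a 6371 km Earth radius)."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * np.arcsin(np.sqrt(a))


# Pair every meter with every station (how='cross' needs pandas 1.2 or later)
pairs = df_id.merge(df_station.add_suffix('_station'), how='cross')

# Distance for every meter/station pair
pairs['distance_km'] = haversine_km(
    pairs['lat'], pairs['long'], pairs['lat_station'], pairs['long_station']
)

# Keep the row with the smallest distance for each meter id
closest = pairs.loc[
    pairs.groupby('id')['distance_km'].idxmin(),
    ['id', 'id_station', 'distance_km'],
].reset_index(drop=True)

print(closest)
```

The cross merge builds one row per meter/station pair, so memory grows with the product of the two frame sizes. For large inputs a spatial index would likely scale better than this sketch.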