I recently came across merge_asof and found it great for merging two dataframes whose times are similar but slightly different. Can we use this technique to merge two dataframes based on lat & lon coordinates, rather than times? One of my dataframes looks like this.
      Latitude   Longitude                    geometry
0    40.457794  -86.914398  POINT (40.45779 -86.91440)
123  40.457794  -86.914398  POINT (40.45779 -86.91440)
246  40.457794  -86.914398  POINT (40.45779 -86.91440)
369  40.457794  -86.914398  POINT (40.45779 -86.91440)
492  40.457794  -86.914398  POINT (40.45779 -86.91440)
The other looks like this.
   Vehicle_ID  Latitude  Longitude                    geometry
0        1233    39.355    -85.220  POINT (39.35500 -85.22000)
1        3033    40.429    -84.346  POINT (40.42900 -84.34600)
2        2202    39.125    -84.823  POINT (39.12500 -84.82300)
3        4011    40.892    -85.974  POINT (40.89200 -85.97400)
4        4432    40.862    -84.371  POINT (40.86200 -84.37100)
I'm trying to follow the documentation here.
https://pandas.pydata.org/pandas-docs/version/0.25.0/reference/api/pandas.merge_asof.html
I tried the following ideas.
df_final = pd.merge_asof(gdf1,gdf2[['geometry']],on='geometry',direction='nearest')
df_final = pd.merge_asof(gdf1, gdf2, on='geometry', direction='nearest')
df_final = pd.merge_asof(df_merged,df_gps['Circuit_Latitude'].sort_values('Circuit_Latitude'),on='Circuit_Latitude')
Nothing is working. I tried to use geopandas to do the merge, but I couldn't get the library installed. BTW, this doesn't have to be super accurate. If the matched lat & lon are 3, 4, or 5 miles apart, that's fine. I'm just trying to get something in the ballpark to match up! Or is there a better way to do this kind of thing? Thanks.
CodePudding user response:
I am guessing that the issue here is the type of the dataframes, or some sort of incompatibility between the libraries.
What I would do is check the types of your dataframes to see whether they are actually pandas DataFrames. If not, I would convert them to the type the method expects:
gpd_pd1 = pd.DataFrame(gdf1)
gpd_pd2 = pd.DataFrame(gdf2)
Then call merge_asof. As far as I can tell, the usage of the method itself looks alright:
pd.merge_asof(gpd_pd1, gpd_pd2, on='geometry', direction='nearest')
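One caveat, as far as I can tell: merge_asof expects both frames to be sorted by the `on` key, and that key has to be numeric or datetime-like, so a geometry column is likely to be rejected even after the conversion. A rough sketch of what it can do is match on a single numeric column such as Latitude. The sample values, the 0.05 degree tolerance, and the `_right` suffix below are assumptions for illustration, and this ignores Longitude entirely, so treat it as a ballpark approach only.

```python
import pandas as pd

# Small made-up frames that mimic the columns in the question.
left = pd.DataFrame({"Latitude": [40.43, 39.13],
                     "Longitude": [-84.35, -84.82]})
right = pd.DataFrame({"Vehicle_ID": [1233, 3033, 2202],
                      "Latitude": [39.355, 40.429, 39.125],
                      "Longitude": [-85.220, -84.346, -84.823]})

# Both sides must be sorted by the key before calling merge_asof.
matched = pd.merge_asof(
    left.sort_values("Latitude"),
    right.sort_values("Latitude"),
    on="Latitude",
    direction="nearest",      # closest latitude, above or below
    tolerance=0.05,           # assumed cutoff, in degrees of latitude
    suffixes=("", "_right"),  # keeps both Longitude columns apart
)
print(matched)
```

Rows whose nearest latitude is further away than the tolerance come back with NaN in the right-hand columns, and two points on the same latitude but far apart in longitude would still be matched.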
CodePudding user response:
I don't think pd.merge_asof handles lat & lon coordinates. This worked for me:
import pandas as pd
import geopandas

df_merged = pd.read_csv('C:\\Users\\ryans\\Desktop\\df1.csv')
df_gps = pd.read_csv('C:\\Users\\ryans\\Desktop\\df2.csv')
# Latitude and Longitude must be float64
print(df_merged.dtypes)
print(df_gps.dtypes)
# points_from_xy takes x (longitude) first, then y (latitude)
gdf_merged = geopandas.GeoDataFrame(df_merged, geometry=geopandas.points_from_xy(df_merged.Longitude, df_merged.Latitude))
gdf_gps = geopandas.GeoDataFrame(df_gps, geometry=geopandas.points_from_xy(df_gps.Longitude, df_gps.Latitude))
# attach the nearest row of gdf_gps to each row of gdf_merged
df_final = geopandas.sjoin_nearest(gdf_merged, gdf_gps)
df_final.head()
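If geopandas won't install, a plain pandas and NumPy version is possible too. This is only a sketch under a few assumptions: both frames have float `Latitude` and `Longitude` columns as in the question, the frames are small enough that a full pairwise distance matrix fits in memory, and the helper name `nearest_by_haversine`, the `_right` suffix, and the 5 mile cutoff are made up for illustration. It uses the haversine great-circle distance rather than anything geopandas does internally.

```python
import numpy as np
import pandas as pd

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius, approximate

def nearest_by_haversine(left, right, max_miles=5.0):
    """Hypothetical helper: attach the closest row of `right` to each row of `left`."""
    # Column vectors for left, row vectors for right, so broadcasting
    # produces a (len(left), len(right)) distance matrix.
    lat1 = np.radians(left["Latitude"].to_numpy())[:, None]
    lon1 = np.radians(left["Longitude"].to_numpy())[:, None]
    lat2 = np.radians(right["Latitude"].to_numpy())[None, :]
    lon2 = np.radians(right["Longitude"].to_numpy())[None, :]

    # Haversine formula for great-circle distance.
    a = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    dist = 2 * EARTH_RADIUS_MILES * np.arcsin(np.sqrt(np.clip(a, 0.0, 1.0)))

    idx = dist.argmin(axis=1)  # position of the nearest right row per left row
    nearest = right.iloc[idx].reset_index(drop=True).add_suffix("_right")
    out = pd.concat([left.reset_index(drop=True), nearest], axis=1)
    out["distance_miles"] = dist[np.arange(len(left)), idx]
    if max_miles is not None:
        out = out[out["distance_miles"] <= max_miles]  # drop far-away matches
    return out

# Example data: vehicle rows from the question plus two made-up site points.
vehicles = pd.DataFrame({
    "Vehicle_ID": [1233, 3033, 2202, 4011, 4432],
    "Latitude": [39.355, 40.429, 39.125, 40.892, 40.862],
    "Longitude": [-85.220, -84.346, -84.823, -85.974, -84.371],
})
sites = pd.DataFrame({"Latitude": [40.43, 40.457794],
                      "Longitude": [-84.35, -86.914398]})

result_all = nearest_by_haversine(sites, vehicles, max_miles=None)
result_close = nearest_by_haversine(sites, vehicles, max_miles=5.0)
print(result_all)
print(result_close)
```

With `max_miles=None` every row gets its nearest match no matter how far away it is; with the cutoff, rows without a match inside the radius are dropped. For large frames the pairwise matrix gets expensive, and a spatial index would be a better fit.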