Find the intersect or nearest geo coordinates from pandas dataframe columns

Time:11-22

I have latitudes, longitudes, and addresses in a pandas DataFrame. A user inputs an address, and I'd like to look up the associated details in the DataFrame based on the lat/long. Here's my code:

import pandas as pd

df_geo = pd.DataFrame({'Address': ['Addr1','Addr2','Addr3'],
                       'Value': [100, 101, 103],
                       'Lat': [33.515226, 33.51529, 33.515230],
                       'Long': [-112.094456, -112.094459, -112.094464]})

I geocode the address using an API and obtain a [lat, long] pair:

[33.515227, -112.094457]

How do I find the matching or nearest coordinates in the DataFrame and pull the Address and Value fields? The geocoding API is already in place. The DataFrame can be fairly large, so I'm looking for an efficient solution, using one of the Python geo libraries if possible.

CodePudding user response:

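Use BallTree from sklearn. Note that the code below uses BallTree's default Euclidean metric on coordinates converted to radians, so dist is not a great-circle distance; pass metric='haversine' to BallTree if you need one: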

import pandas as pd
import numpy as np
from sklearn.neighbors import BallTree

df_geo = pd.DataFrame({'Address': ['Addr1','Addr2','Addr3'],
                       'Value': [100, 101, 103],
                       'Lat': [33.515226, 33.51529, 33.515230],
                       'Long': [-112.094456, -112.094459, -112.094464]})

coords = [33.515227, -112.094457]

X = np.deg2rad(df_geo[['Lat', 'Long']].values)
y = np.deg2rad(np.array([coords]))

tree = BallTree(X, leaf_size=2)
dist, ind = tree.query(y)

Output:

>>> df_geo[['Address', 'Value']].iloc[ind[0][0]].tolist()
['Addr1', 100]

>>> dist
array([[2.46826831e-08]])

>>> ind
array([[0]])
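
For distances that can be read in metres, the same lookup can be sketched with the haversine metric. This is a minimal sketch, not the answerer's code: the helper name nearest_row and the constant EARTH_RADIUS_M (a mean Earth radius of about 6371 km) are assumptions made for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.neighbors import BallTree

EARTH_RADIUS_M = 6_371_000  # assumed mean Earth radius in metres

df_geo = pd.DataFrame({'Address': ['Addr1', 'Addr2', 'Addr3'],
                       'Value': [100, 101, 103],
                       'Lat': [33.515226, 33.51529, 33.515230],
                       'Long': [-112.094456, -112.094459, -112.094464]})

# The haversine metric expects [lat, lon] pairs in radians.
# Build the tree once and reuse it for every lookup.
tree = BallTree(np.deg2rad(df_geo[['Lat', 'Long']].to_numpy()),
                metric='haversine')


def nearest_row(lat, lon):
    """Hypothetical helper: return the closest row and its distance in metres."""
    query = np.deg2rad([[lat, lon]])
    dist, ind = tree.query(query, k=1)  # dist is in radians on the unit sphere
    row = df_geo.iloc[ind[0, 0]]
    return row, dist[0, 0] * EARTH_RADIUS_M


row, dist_m = nearest_row(33.515227, -112.094457)
print(row['Address'], row['Value'], dist_m)
```

Building the tree once and querying it per user request is what keeps this approach efficient on a large DataFrame; tree.query also accepts many query points at once if several addresses need to be resolved together.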

CodePudding user response:

IIUC, use numpy.isclose. Since all the values are very close together, the solution below pulls every record with the default tolerances.

In [862]: import numpy as np
In [863]: lat_long = [33.515227, -112.094457]

In [870]: df_geo[np.isclose(df_geo[['Lat', 'Long']], lat_long)].drop_duplicates()[['Address', 'Value']]
Out[870]: 
  Address  Value
0   Addr1    100
1   Addr2    101
2   Addr3    103
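
To narrow the match, the tolerance can be set explicitly and both columns required to be close. The following is a sketch under assumptions: TOL_DEG is a hypothetical tolerance in degrees (2e-6 degrees is a few tens of centimetres of latitude) and should be tuned to the geocoder's precision.

```python
import numpy as np
import pandas as pd

df_geo = pd.DataFrame({'Address': ['Addr1', 'Addr2', 'Addr3'],
                       'Value': [100, 101, 103],
                       'Lat': [33.515226, 33.51529, 33.515230],
                       'Long': [-112.094456, -112.094459, -112.094464]})

lat_long = [33.515227, -112.094457]
TOL_DEG = 2e-6  # hypothetical absolute tolerance, in degrees

# rtol=0 turns off the relative tolerance so only TOL_DEG applies.
close = np.isclose(df_geo[['Lat', 'Long']].to_numpy(), lat_long,
                   rtol=0, atol=TOL_DEG)

# Keep a row only when both Lat and Long are within tolerance.
mask = close.all(axis=1)
matches = df_geo.loc[mask, ['Address', 'Value']]
print(matches)
```

With this tolerance only Addr1 remains. Unlike a nearest-neighbour query, this can return zero rows or several, so the caller has to handle both cases.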