I have a problem. I want to get the coordinates (longitude and latitude) from an address. Inside the method I want to check whether the address already has a longitude and latitude value and, if so, reuse it instead of querying again via geolocator.geocode(df['address']). Unfortunately I get the error ValueError: Columns must be same length as key.
Dataframe
                                         address  customer
0              Surlej, 7513, Silvaplana, Schweiz         1
1  Vodnikova cesta 35, 1000 Ljubljana, Slowenien         2
2              Surlej, 7513, Silvaplana, Schweiz         1
Code
from functools import lru_cache

import pandas as pd
from geopy.exc import GeocoderTimedOut
from geopy.geocoders import Nominatim

geolocator = Nominatim(user_agent='testing_stackoverflow')

d = {
    "address": [
        'Surlej, 7513, Silvaplana, Schweiz',
        'Vodnikova cesta 35, 1000 Ljubljana, Slowenien',
        'Surlej, 7513, Silvaplana, Schweiz',
    ],
    "customer": [1, 2, 1],
}
df = pd.DataFrame(data=d)
print(df)

@lru_cache(maxsize=None)  # repeated addresses are answered from the cache
def function_that_returns_lat_lon_from_address(address):
    try:
        # Geocode inside the try block so that a timeout is actually caught.
        location = geolocator.geocode(address, timeout=10)
        print(location)
        if location is None:
            return (None, None)
        return (location.latitude, location.longitude)
    except GeocoderTimedOut as e:
        print("Timeout ", e)
        return (None, None)

# This assignment raises the ValueError.
df[['lat', 'lon']] = df['address'].apply(function_that_returns_lat_lon_from_address)
What I want
                                         address  customer   latitude  longitude
0              Surlej, 7513, Silvaplana, Schweiz         1  46.459902   9.803370
1  Vodnikova cesta 35, 1000 Ljubljana, Slowenien         2  46.065523  14.490775
2              Surlej, 7513, Silvaplana, Schweiz         1  46.459902   9.803370
Error
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-8-4873cdd27090> in <module>()
     24         return(None, None)
     25
---> 26 df[['lat', 'lon']] = df['address'].apply(function_that_returns_lat_lon_from_address)

2 frames
/usr/local/lib/python3.7/dist-packages/pandas/core/frame.py in _iset_not_inplace(self, key, value)
   3673         if self.columns.is_unique:
   3674             if np.shape(value)[-1] != len(key):
-> 3675                 raise ValueError("Columns must be same length as key")
   3676
   3677             for i, col in enumerate(key):

ValueError: Columns must be same length as key
Answer:
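Solution to the problem
df['address'].apply(...) returns a single Series whose elements are (lat, lon) tuples. pandas does not unpack that Series into two columns: the check np.shape(value)[-1] != len(key) shown in the traceback compares the 3 rows of the Series against the 2 column names and raises the ValueError. Convert the Series into a list of tuples so that each row holds two items. pandas then unpacks the tuples and assigns the values to the two columns.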
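The difference between the two shapes can be checked without any network access. The sketch below is only an illustration: fake_lat_lon and its hard-coded FAKE_COORDS table are hypothetical stand-ins for the real geocoding function, and the coordinates are the ones listed in the question.

```python
import pandas as pd

# Hypothetical stand-in for the geocoder: hard-coded coordinates, no network call.
FAKE_COORDS = {
    "Surlej, 7513, Silvaplana, Schweiz": (46.459902, 9.803370),
    "Vodnikova cesta 35, 1000 Ljubljana, Slowenien": (46.065523, 14.490775),
}


def fake_lat_lon(address):
    return FAKE_COORDS.get(address, (None, None))


df = pd.DataFrame({
    "address": [
        "Surlej, 7513, Silvaplana, Schweiz",
        "Vodnikova cesta 35, 1000 Ljubljana, Slowenien",
        "Surlej, 7513, Silvaplana, Schweiz",
    ],
    "customer": [1, 2, 1],
})

# apply() yields a one-dimensional Series; each element is a (lat, lon) tuple.
result = df["address"].apply(fake_lat_lon)
print(result.shape)  # (3,)

# Assigning the Series directly is expected to fail as in the question
# (three rows are compared against two column names).
try:
    df[["lat", "lon"]] = result
except ValueError as exc:
    print("direct assignment failed:", exc)

# A list of tuples is read as three rows with two values each.
df[["lat", "lon"]] = result.tolist()
print(df)
```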
A slightly faster solution is to skip apply and map the function over the column directly:
df[['lat', 'lon']] = list(map(function_that_returns_lat_lon_from_address, df.address))
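Because the function in the question is decorated with lru_cache, mapping it over the column should only reach the geocoder once per distinct address; repeated addresses are served from the cache. A minimal sketch of that behaviour, where slow_lookup is a hypothetical stand-in for the Nominatim call and returns dummy values:

```python
from functools import lru_cache

calls = []  # records every lookup that is not answered from the cache


@lru_cache(maxsize=None)
def slow_lookup(address):
    # Hypothetical stand-in for geolocator.geocode; the values are dummies.
    calls.append(address)
    return (float(len(address)), 0.0)


addresses = ["Surlej", "Ljubljana", "Surlej"]
coords = list(map(slow_lookup, addresses))

print(len(calls))                # 2
print(slow_lookup.cache_info())  # hits=1, misses=2
```

Note that lru_cache stores whatever the function returns, so a (None, None) result from a timeout is cached as well and will not be retried for that address within the same session.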
Or you can fix your code by simply adding a .tolist() conversion:
df[['lat', 'lon']] = df['address'].apply(function_that_returns_lat_lon_from_address).tolist()
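If the DataFrame already carries lat and lon values for some rows, one possible way to reuse them and query only the remaining addresses is sketched below. This is an assumption about the intended workflow, not part of the original code: fill_missing_coordinates, stub_geocode and the pre-filled lat/lon columns are hypothetical names and data, and the geocoding function is passed in as an argument so the real cached function could be used in its place.

```python
import pandas as pd


def _as_float(value):
    # Failed lookups return None; store them as NaN in the float columns.
    return float("nan") if value is None else float(value)


def fill_missing_coordinates(df, geocode_func):
    """Return a copy of df in which only rows lacking lat/lon are geocoded."""
    out = df.copy()
    for col in ("lat", "lon"):
        if col not in out.columns:
            out[col] = float("nan")

    missing = out["lat"].isna() | out["lon"].isna()

    # Coordinates already present in the frame, keyed by address.
    known = out.loc[~missing].drop_duplicates("address")
    coords = dict(zip(known["address"], zip(known["lat"], known["lon"])))

    # Query each remaining distinct address once.
    for address in out.loc[missing, "address"].unique():
        if address not in coords:
            coords[address] = geocode_func(address)

    if missing.any():
        todo = out.loc[missing, "address"]
        out.loc[missing, "lat"] = [_as_float(coords[a][0]) for a in todo]
        out.loc[missing, "lon"] = [_as_float(coords[a][1]) for a in todo]
    return out


calls = []


def stub_geocode(address):
    # Hypothetical geocoder stand-in that records which addresses were queried.
    calls.append(address)
    table = {"Vodnikova cesta 35, 1000 Ljubljana, Slowenien": (46.065523, 14.490775)}
    return table.get(address, (None, None))


df = pd.DataFrame({
    "address": [
        "Surlej, 7513, Silvaplana, Schweiz",
        "Vodnikova cesta 35, 1000 Ljubljana, Slowenien",
        "Surlej, 7513, Silvaplana, Schweiz",
    ],
    "customer": [1, 2, 1],
    "lat": [46.459902, None, None],
    "lon": [9.803370, None, None],
})

out = fill_missing_coordinates(df, stub_geocode)
print(out)
print(calls)  # only the Ljubljana address is looked up
```

The third row reuses the coordinates already stored for the same address in the first row, so the stub is called for the Ljubljana address only.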