I have a list of 200 latitude and longitude coordinate pairs.
For each coordinate pair I want to look up the district and the state, so that my dataframe has 3 columns: cord, district and state.
For this I am using the geopy library, but I am unable to get records for more than 115 coordinates.
Sample Data
cord
0 (19.4, 17.93)
1 (55.54, 93.93)
2 (52.45, 78.93)
3 (65.54, 67.93)
4 (47.74, 99.93)
Required Output Demo
cord district state
0 (19.4, 17.93) xyz aaa
1 (55.54, 93.93) adc aaa
2 (52.45, 78.93) gyu drt
3 (65.54, 67.93) www bhn
4 (47.74, 99.93) ccf bvg
I have tried this code, but I am unable to fetch details for more than 115 queries.
from geopy.geocoders import Nominatim
district = {} # Initialize empty dict
# geo_loc is a list containing all the coordinates in this format: (lat, long)
for cord in geo_loc:
    geolocator = Nominatim(user_agent='user_agent')
    location = geolocator.reverse(cord, addressdetails=True)
    district[cord] = location.raw['address']['state_district']
I need to fetch up to 500 unique coordinates at a time.
I also need the district and state names in separate columns.
CodePudding user response:
The Nominatim Usage Policy forbids heavy usage: "No heavy uses (an absolute maximum of 1 request per second)." You can use geopy's RateLimiter to send at most 1 request per second. I've tested that the following code works for more than 115 requests:
from geopy.extra.rate_limiter import RateLimiter
from geopy.geocoders import Nominatim
import pandas as pd
geolocator = Nominatim(user_agent="user_agent")
# add rate limit
reverse = RateLimiter(geolocator.reverse, min_delay_seconds=1)
state_list = [] # Initialize empty list
# create dataframe
df = pd.DataFrame({"geo_loc" :[(19.4, 17.93), (55.54, 93.93),(52.45, 78.93), (65.54, 67.93), (47.74, 99.93) ]})
# get location coordinates
geo_loc = df.geo_loc.values
for cord in geo_loc:
    # send request
    location = reverse(cord, addressdetails=True)
    # get state value
    state = location.raw["address"].get("state")
    # store state value
    state_list.append(state)
# assign back states
df['states'] = state_list
print(df)
Resulting dataframe:
geo_loc states
0 (19.4, 17.93) Tibesti تيبستي
1 (55.54, 93.93) Красноярский край
2 (52.45, 78.93) Алтайский край
3 (65.54, 67.93) Ямало-Ненецкий автономный округ
4 (47.74, 99.93) Архангай
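The code above only fills in the state, while the question also asks for the district in a separate column. Here is a minimal sketch of one way to collect both. The helper names `split_address` and `build_rows` are made up for illustration and are not part of geopy. The sketch assumes the district is found under the `state_district` key, as in the question's code, and falls back to `county` when that key is missing; that fallback is my assumption and may not suit every country. Missing keys and empty results give None instead of raising an error.

```python
def split_address(address):
    """Return (district, state) from a Nominatim-style address dict.

    Uses .get() so a missing key yields None instead of a KeyError.
    The fallback to "county" is an assumption, not a Nominatim rule.
    """
    district = address.get("state_district") or address.get("county")
    state = address.get("state")
    return district, state


def build_rows(coords, reverse):
    """Call reverse() once per coordinate and collect one row per pair.

    `reverse` is any callable with the same call shape as geopy's
    geolocator.reverse, for example one wrapped in RateLimiter.
    """
    rows = []
    for cord in coords:
        location = reverse(cord, addressdetails=True)
        # reverse() may return None when nothing is found for the point
        address = location.raw.get("address", {}) if location is not None else {}
        district, state = split_address(address)
        rows.append({"cord": cord, "district": district, "state": state})
    return rows
```

To use it with the code above, pass the rate-limited `reverse` and the coordinate list, then build the dataframe from the rows: `df = pd.DataFrame(build_rows(list(geo_loc), reverse))`. This should give the three columns cord, district and state.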