I have a csv that looks like this but is much longer:
ID | Address |
---|---|
32 | 54298 Boca Chica Blvd Brownsville, TX |
31 | 6640 Washington St, Yountville, CA 94599 |
I am attempting to loop through the csv, use geopy to get latitudes and longitudes for each address, and then insert them into a 3rd and 4th column that would look like this:
ID | Address | Latitude | Longitude |
---|---|---|---|
32 | 54298 Boca Chica Blvd Brownsville, TX | 34.5165131 | 90.1656516 |
31 | 6640 Washington St, Yountville, CA 94599 | 26.1231 | 51.516125 |
It doesn't really matter if it is in the same old csv (that's what I was trying), or into a new csv. The previous posts I have read either focus on appending whole rows instead of columns, or they just populate the new columns with data that is hand typed/hard coded into the python script.
So far I am able to loop through and get the latitudes and longitudes. The problem I am having is primarily on the writing part. Here is what I have.
import geopy
import pandas as pd
from geopy.geocoders import Nominatim
geolocator = Nominatim(user_agent="sample app")
df = pd.read_csv(r"C:\Users\Desktop\LatLongTest.csv")
for row in df.itertuples():
try:
data = geolocator.geocode(row[7])
print(data.point.latitude)
except AttributeError:
print('error')
Any help would be greatly appreciated! I only have a bit of Java experience from back in the day. The documentation and published examples for Python are not as intuitive to me as Java's was, so getting started is a bit of a challenge.
CodePudding user response:
All you need to to add columns and then save the df
dataframe. See also Adding new column to existing DataFrame in Python pandas
You can try something like this:
import geopy
import pandas as pd
from geopy.geocoders import Nominatim
geolocator = Nominatim(user_agent="sample app")
df = pd.read_csv(r"C:\Users\Desktop\LatLongTest.csv")
# add new columns
df['Longitude'] = 0
df['Latitude'] = 0
# process all rows
for row in df.itertuples():
try:
data = geolocator.geocode(row[7])
df.at[row.Index, 'Longitude'] = data.point.longitude
df.at[row.Index, 'Latitude'] = data.point.latitude
except AttributeError:
print('error')
# write to output
df.to_csv(r"C:\Users\Desktop\LatLongTest_2.csv", sep=";", index=False)
CodePudding user response:
With this method I'll show, you'll need to create two list, one with the Latitude and another one with the Longitude and later append as a Column to your DataFrame. Just let me remind you that they must have the same amount of rows, or it will give an error.
import pandas as pd
df = pd.read_csv(r"C:\Users\Desktop\LatLongTest.csv")
add_to_df = {'Latitude': LatitudeList, 'Longitude': LongitudeList}
df = df.append(pd.DataFrame(add_to_df))
df
It is also possible to do it like this, adding a value to each row:
import pandas as pd
df = pd.read_csv(r"C:\Users\Desktop\LatLongTest.csv")
df.loc[df.Adress == '54298 Boca Chica Blvd Brownsville, TX', "Latitude"] = Latitude
df.loc[df.Adress == '54298 Boca Chica Blvd Brownsville, TX', "Longitude"] = Longitude
df
You can check out more resources at this link