With this DataFrame, I am trying to use the start lat/long and end lat/long to create a new column that holds the Haversine distance between the two points:
import pandas as pd
import haversine as hs
d = {'start_lat': [35.9946, 29.4400, 29.4400], 'start_long': [-81.7266, -98.4590, -98.4590],
     'end_lat': [36.430124, 29.819364, 29.273085], 'end_long': [-81.179483, -99.142791, -98.836360]}
df = pd.DataFrame(data=d)
df
I can get the Haversine function to work as a standalone function:
def hav(x, y):
    return hs.haversine(x, y)
start_coord=(35.9946, -81.7266)
end_coord=(36.430124, -81.179483)
print(hav(start_coord, end_coord))
To create a new Haversine column for the df, I first created two new coordinate columns:
df['start_coord'] = list(zip(df.start_lat, df.start_long))
df['end_coord'] = list(zip(df.end_lat, df.end_long))
df
I then try to apply the function when creating the new column, but I get a ValueError: too many values to unpack (expected 2)
df["Haversine_dist"] = hav(df["start_coord"],df["end_coord"])
df
CodePudding user response:
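You can use apply with a lambda and axis=1 here so that the function works on one row at a time. When you pass df['start_coord'] directly, you are handing hav the whole Series rather than a single (lat, long) tuple, so it cannot be unpacked into two values.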
df["Haversine_dist"] = df.apply(lambda x: hav(x["start_coord"], x["end_coord"]), axis=1)
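To see why the whole-column call fails and the per-row call works, here is a minimal sketch that uses only the standard library. The hav_km function is a hypothetical stand-in for hs.haversine, not the library's code: it assumes a mean Earth radius of 6371.0088 km and that each point is a (lat, long) pair in degrees. The lists stand in for the start_coord and end_coord columns from the question.

```python
import math

def hav_km(start, end, radius_km=6371.0088):
    """Great-circle distance in km between two (lat, long) pairs in degrees.

    Hypothetical stand-in for hs.haversine; the radius is an assumed mean value.
    """
    lat1, lon1 = start  # unpacking fails unless `start` holds exactly two values
    lat2, lon2 = end
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))

# Stand-ins for df["start_coord"] and df["end_coord"] (three rows each).
start_coords = [(35.9946, -81.7266), (29.4400, -98.4590), (29.4400, -98.4590)]
end_coords = [(36.430124, -81.179483), (29.819364, -99.142791),
              (29.273085, -98.836360)]

# Passing the whole columns: each argument holds three tuples, not two numbers,
# so the unpacking inside the function raises ValueError.
try:
    hav_km(start_coords, end_coords)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

# Passing one pair of tuples at a time, which is what apply(..., axis=1) does.
distances = [hav_km(s, e) for s, e in zip(start_coords, end_coords)]
print(error_message)
print(distances)
```

With the DataFrame from the question, the same row-by-row idea could also be written without apply as df["Haversine_dist"] = [hav(s, e) for s, e in zip(df["start_coord"], df["end_coord"])].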