Pandas DataFrame, Haversine function of 4 lat/long columns to new column

Time:02-21

Given this DataFrame, I am trying to use the start lat/long and end lat/long columns to create a new column containing the Haversine distance between the two points:

import pandas as pd
import haversine as hs

d = {'start_lat': [35.9946, 29.4400, 29.4400], 'start_long': [-81.7266, -98.4590, -98.4590],
     'end_lat': [36.430124, 29.819364, 29.273085], 'end_long': [-81.179483, -99.142791, -98.836360]}
df = pd.DataFrame(data=d)
df

I can get the Haversine function to work as a standalone function:

def hav(x, y):
    return hs.haversine(x, y)

start_coord=(35.9946, -81.7266)
end_coord=(36.430124, -81.179483)

print(hav(start_coord, end_coord))

To create the new Haversine column for the df, I first built two coordinate columns:

df['start_coord'] = list(zip(df.start_lat, df.start_long))
df['end_coord'] = list(zip(df.end_lat, df.end_long))
df

I then try to apply the function when creating the new column, but it raises "ValueError: too many values to unpack (expected 2)":

df["Haversine_dist"] = hav(df["start_coord"],df["end_coord"])
df

CodePudding user response:

You can use apply with a lambda and axis=1 so that hav is called once per row. When you pass df['start_coord'] directly, hav receives the whole Series rather than a single (lat, long) tuple, which is why the unpacking fails.

df["Haversine_dist"] = df.apply(lambda x: hav(x["start_coord"], x["end_coord"]), axis=1)
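
If the DataFrame is large, row-wise apply can be slow. As an alternative, here is a minimal sketch that computes the distance directly from the four original columns with NumPy, without the intermediate coordinate columns. The helper haversine_km and the constant EARTH_RADIUS_KM are hypothetical names written for this example, not part of the haversine package, and the radius value is an assumed mean Earth radius, so results may differ slightly from hs.haversine.

```python
import numpy as np
import pandas as pd

# Assumed mean Earth radius in kilometres (illustrative constant).
EARTH_RADIUS_KM = 6371.0088


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km; accepts scalars, arrays, or Series in degrees."""
    # Convert every input from degrees to radians.
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    # Haversine formula, evaluated element-wise.
    a = np.sin(dlat / 2) ** 2 + np.cos(lat1) * np.cos(lat2) * np.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(a))


d = {'start_lat': [35.9946, 29.4400, 29.4400], 'start_long': [-81.7266, -98.4590, -98.4590],
     'end_lat': [36.430124, 29.819364, 29.273085], 'end_long': [-81.179483, -99.142791, -98.836360]}
df = pd.DataFrame(data=d)

# One vectorized call covers all rows at once.
df["Haversine_dist"] = haversine_km(df["start_lat"], df["start_long"],
                                    df["end_lat"], df["end_long"])
print(df)
```

Because the function only uses element-wise NumPy operations, the same call works for a single pair of points as well as for whole columns.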