I have a dataframe with station info, including latitudes and longitudes, as follows:
start_lat start_lng end_lat end_lng
41.877726 -87.654787 41.888716 -87.644448
41.930000 -87.700000 41.910000 -87.700000
41.910000 -87.690000 41.930000 -87.700000
and so on.
I want to create a distance column from this info, giving the distance between the start and end points in either km or miles.
(As shared in the link below, when I try to implement the SO answer, I encounter an error.)
from math import sin, cos, sqrt, atan2
dlon = data.end_lng - data.start_lng
dlat = data.end_lat - data.start_lat
a = ((sin(dlat/2))**2 + cos(lat1) * cos(lat2) * (sin(dlon/2))**2)
c = 2 * atan2(sqrt(a), sqrt(1-a))
data['distance'] = R * c
TypeError Traceback (most recent call last)
<ipython-input-8-a8f8b698a81b> in <module>()
2 dlon = data.end_lng - data.start_lng
3 dlat = data.end_lat - data.start_lat
----> 4 a = ((sin(dlat/2))**2 + cos(lat1) * cos(lat2) * (sin(dlon/2))**2).apply(lambda x: float(x))
5 c = 2 * atan2(sqrt(a), sqrt(1-a))
6 data['distance'] = R * c
/usr/local/lib/python3.7/dist-packages/pandas/core/series.py in wrapper(self)
127 if len(self) == 1:
128 return converter(self.iloc[0])
--> 129 raise TypeError(f"cannot convert the series to {converter}")
130
131 wrapper.__name__ = f"__{converter.__name__}__"
TypeError: cannot convert the series to <class 'float'>
How can I resolve this?
CodePudding user response:
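The functions in the math module (sin, cos, sqrt, atan2) accept only single numbers, so passing them a whole Series raises this TypeError. You need to do the calculation on every row. One way is to use iterrows (no guarantee on the distance calculation itself):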
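from math import radians, sin, cos, sqrt, atan2

def get_distance(row, R=6371):  # R: mean Earth radius in km
    # row is the (index, Series) tuple yielded by iterrows()
    # the math trig functions expect radians, so convert the degrees first
    lat1, lat2 = radians(row[1]['start_lat']), radians(row[1]['end_lat'])
    dlon = radians(row[1]['end_lng'] - row[1]['start_lng'])
    dlat = lat2 - lat1
    a = sin(dlat/2)**2 + cos(lat1) * cos(lat2) * sin(dlon/2)**2  # haversine term
    c = 2 * atan2(sqrt(a), sqrt(1-a))
    return R * c  # distance in km

data['distance'] = [get_distance(row) for row in data.iterrows()]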
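If the dataframe is large, looping over rows can be slow. A possible alternative is to run the same haversine formula on whole columns with NumPy, whose functions accept a Series. The following is only a sketch: the function name haversine_series, the unit parameter and the two constants are made up for illustration, the coordinates are assumed to be in degrees, and 6371 km is an approximate mean Earth radius.

```python
import numpy as np
import pandas as pd

EARTH_RADIUS_KM = 6371.0  # approximate mean Earth radius (assumption)
KM_PER_MILE = 1.609344

def haversine_series(df, unit="km"):
    """Great-circle distance for every row, computed column-wise.

    Hypothetical helper: expects start_lat, start_lng, end_lat, end_lng
    columns holding coordinates in degrees.
    """
    # NumPy functions work element-wise on a Series, unlike math.sin/cos
    lat1 = np.radians(df["start_lat"])
    lat2 = np.radians(df["end_lat"])
    dlat = lat2 - lat1
    dlon = np.radians(df["end_lng"] - df["start_lng"])
    a = np.sin(dlat / 2) ** 2 + np.cos(lat1) * np.cos(lat2) * np.sin(dlon / 2) ** 2
    km = 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(a))
    # any unit other than "km" is treated as miles in this sketch
    return km if unit == "km" else km / KM_PER_MILE

# the sample rows from the question
data = pd.DataFrame({
    "start_lat": [41.877726, 41.930000, 41.910000],
    "start_lng": [-87.654787, -87.700000, -87.690000],
    "end_lat": [41.888716, 41.910000, 41.930000],
    "end_lng": [-87.644448, -87.700000, -87.700000],
})
data["distance"] = haversine_series(data)               # km
data["distance_mi"] = haversine_series(data, "miles")   # miles
```

This gives one distance per row without an explicit Python loop; for a handful of rows the iterrows version is just as good.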