I have a dataframe with GPS data (longitude and latitude) and a tripId. I want to calculate the distance between consecutive GPS coordinates (from each row to the next) within each tripId. Is it possible to add a new column "Distance" that contains the results? (For each trip I would get one value fewer than the number of rows.)
- timestamp longitude latitude tripId
0 2021-04-30 21:13:53 8.211610 53.189479 1790767
1 2021-04-30 21:13:54 8.211462 53.189479 1790767
2 2021-04-30 21:13:55 8.211367 53.189476 1790767
3 2021-04-30 21:13:56 8.211343 53.189479 1790767
4 2021-04-30 21:13:57 8.211335 53.189490 1790767
5 2021-04-30 21:13:59 8.211338 53.189491 1790767
6 2021-04-30 21:14:00 8.211299 53.189479 1790767
7 2021-04-30 21:14:01 8.211311 53.189468 1790767
8 2021-04-30 21:14:02 8.211327 53.189446 1790767
9 2021-04-30 21:14:03 8.211338 53.189430 1790767
I've tested it on the first 10 rows, but it still doesn't work:
import math

def haversine(coord1, coord2):
    R = 6372800  # Earth radius in meters
    lat1, lon1 = coord1
    lat2, lon2 = coord2
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi/2)**2 + \
        math.cos(phi1)*math.cos(phi2)*math.sin(dlambda/2)**2
    return 2*R*math.atan2(math.sqrt(a), math.sqrt(1 - a))

x = df.tripId[0]
for i in range(0, 10):
    while(df.tripId[i] == x):
        coord1 = df.latitude[i], df.longitude[i]
        coord2 = df.latitude[i + 1], df.longitude[i + 1]
        df.distance = haversine(coord1, coord2)
CodePudding user response:
The haversine module already contains a function that can process vectors directly. As your input data is already a dataframe, you should use haversine_vector. You can compute the distance column directly with it, even if your dataframe contains more than one tripId value:
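import haversine
import pandas as pd

def calc_dist(df):
    # distance between each row and the previous row of the same group
    s = pd.Series(haversine.haversine_vector(df, df.shift()),
                  index=df.index, name='distance')
    return pd.DataFrame(s)

# group_keys=False keeps the original index so that concat aligns the rows
df = pd.concat([df, df.groupby('tripId', group_keys=False)[['latitude', 'longitude']].apply(calc_dist)],
               axis=1)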
From your sample data, it gives (distances in kilometers, the default unit of haversine_vector):
- timestamp longitude latitude tripId distance
0 2021-04-30 21:13:53 8.211610 53.189479 1790767 NaN
1 2021-04-30 21:13:54 8.211462 53.189479 1790767 0.009860
2 2021-04-30 21:13:55 8.211367 53.189476 1790767 0.006338
3 2021-04-30 21:13:56 8.211343 53.189479 1790767 0.001633
4 2021-04-30 21:13:57 8.211335 53.189490 1790767 0.001334
5 2021-04-30 21:13:59 8.211338 53.189491 1790767 0.000229
6 2021-04-30 21:14:00 8.211299 53.189479 1790767 0.002921
7 2021-04-30 21:14:01 8.211311 53.189468 1790767 0.001461
8 2021-04-30 21:14:02 8.211327 53.189446 1790767 0.002668
9 2021-04-30 21:14:03 8.211338 53.189430 1790767 0.001924
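
If you would rather keep your own haversine function and get the result in meters, the loop can compare each row with the previous one and start over whenever the tripId changes. The following is only a minimal sketch in plain Python: trip_distances is a name made up for this example, the last sample row uses an invented tripId (42) to show the reset, and it assumes the rows of each trip are contiguous and already in time order.

```python
import math

def haversine(coord1, coord2):
    """Great-circle distance in meters between two (lat, lon) pairs in degrees."""
    R = 6372800  # Earth radius in meters, same value as in the question
    lat1, lon1 = coord1
    lat2, lon2 = coord2
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * R * math.atan2(math.sqrt(a), math.sqrt(1 - a))

def trip_distances(trip_ids, lats, lons):
    """Distance to the previous row of the same trip; NaN on each trip's first row.

    Hypothetical helper, assumes each trip's rows are contiguous and ordered.
    """
    distances = []
    prev_id, prev_coord = None, None
    for trip_id, lat, lon in zip(trip_ids, lats, lons):
        coord = (lat, lon)
        if prev_coord is not None and trip_id == prev_id:
            distances.append(haversine(prev_coord, coord))
        else:
            distances.append(math.nan)  # no previous point in this trip
        prev_id, prev_coord = trip_id, coord
    return distances

# First three sample rows, plus one row with an invented tripId (42)
trip_ids = [1790767, 1790767, 1790767, 42]
lats = [53.189479, 53.189479, 53.189476, 53.189479]
lons = [8.211610, 8.211462, 8.211367, 8.211343]
result = trip_distances(trip_ids, lats, lons)
```

With the dataframe from the question, the same helper could be called as df['distance'] = trip_distances(df.tripId, df.latitude, df.longitude). It will be slower than haversine_vector on large dataframes, since it loops row by row in Python.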