distance between 2 gps coordinates

Time:10-20

I have a dataframe with GPS data (longitude and latitude) and a tripId. I want to calculate the distance between each pair of consecutive GPS coordinates (consecutive rows) for each tripId. Is it possible to add a new column "Distance" that contains the results (there will be one fewer distance than rows)?

-   timestamp           longitude   latitude    tripId 
0   2021-04-30 21:13:53 8.211610    53.189479   1790767 
1   2021-04-30 21:13:54 8.211462    53.189479   1790767 
2   2021-04-30 21:13:55 8.211367    53.189476   1790767 
3   2021-04-30 21:13:56 8.211343    53.189479   1790767 
4   2021-04-30 21:13:57 8.211335    53.189490   1790767 
5   2021-04-30 21:13:59 8.211338    53.189491   1790767 
6   2021-04-30 21:14:00 8.211299    53.189479   1790767 
7   2021-04-30 21:14:01 8.211311    53.189468   1790767 
8   2021-04-30 21:14:02 8.211327    53.189446   1790767 
9   2021-04-30 21:14:03 8.211338    53.189430   1790767

I've tried it on the first 10 rows, but it still doesn't work:

    import math

    def haversine(coord1, coord2):
        R = 6372800  # Earth radius in meters
        lat1, lon1 = coord1
        lat2, lon2 = coord2

        phi1, phi2 = math.radians(lat1), math.radians(lat2)
        dphi = math.radians(lat2 - lat1)
        dlambda = math.radians(lon2 - lon1)

        a = math.sin(dphi/2)**2 + \
            math.cos(phi1)*math.cos(phi2)*math.sin(dlambda/2)**2

        return 2*R*math.atan2(math.sqrt(a), math.sqrt(1 - a))


    x = df.tripId[0]

    for i in range(0, 10):
        while(df.tripId[i] == x):
            coord1 = df.latitude[i], df.longitude[i]
            coord2 = df.latitude[i + 1], df.longitude[i + 1]
            df.distance = haversine(coord1, coord2)

CodePudding user response:

The haversine package already provides a function that can process vectors directly. As your input data is already a dataframe, you should use haversine_vector. You can compute the distance column with it directly, even if your dataframe contains more than one tripId value:

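    import haversine
    import pandas as pd

    def calc_dist(df):
        # distance from the previous row to each row; NaN for the first row of a trip
        s = pd.Series(haversine.haversine_vector(df, df.shift()),
                      index=df.index, name='distance')
        return pd.DataFrame(s)

    # group_keys=False keeps the original index so that concat aligns the rows
    df = pd.concat([df, df.groupby('tripId', group_keys=False)[['latitude', 'longitude']].apply(calc_dist)],
                   axis=1)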

With your sample data, it gives (distances in kilometers, the default unit of haversine):

-            timestamp  longitude   latitude   tripId  distance
0  2021-04-30 21:13:53   8.211610  53.189479  1790767       NaN
1  2021-04-30 21:13:54   8.211462  53.189479  1790767  0.009860
2  2021-04-30 21:13:55   8.211367  53.189476  1790767  0.006338
3  2021-04-30 21:13:56   8.211343  53.189479  1790767  0.001633
4  2021-04-30 21:13:57   8.211335  53.189490  1790767  0.001334
5  2021-04-30 21:13:59   8.211338  53.189491  1790767  0.000229
6  2021-04-30 21:14:00   8.211299  53.189479  1790767  0.002921
7  2021-04-30 21:14:01   8.211311  53.189468  1790767  0.001461
8  2021-04-30 21:14:02   8.211327  53.189446  1790767  0.002668
9  2021-04-30 21:14:03   8.211338  53.189430  1790767  0.001924
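
If you would rather not depend on the haversine package, the same idea can be sketched with NumPy and a grouped shift. This is only a minimal sketch: the helper name `add_distance`, the constant `EARTH_RADIUS_M`, and the second trip in the sample are assumptions made for illustration. It reuses the formula and the Earth radius from the question, so the result is in meters rather than kilometers.

```python
import numpy as np
import pandas as pd

EARTH_RADIUS_M = 6372800  # same radius (in meters) as in the question


def add_distance(df):
    """Return a copy of df with a 'distance' column: meters from the
    previous row of the same trip (hypothetical helper, not a library API)."""
    # Previous position within each trip; the first row of a trip gets NaN.
    prev = df.groupby("tripId")[["latitude", "longitude"]].shift()

    phi1 = np.radians(prev["latitude"])
    phi2 = np.radians(df["latitude"])
    dphi = phi2 - phi1
    dlambda = np.radians(df["longitude"] - prev["longitude"])

    # Haversine formula, applied to whole columns at once.
    a = np.sin(dphi / 2) ** 2 + np.cos(phi1) * np.cos(phi2) * np.sin(dlambda / 2) ** 2
    dist = 2 * EARTH_RADIUS_M * np.arctan2(np.sqrt(a), np.sqrt(1 - a))
    return df.assign(distance=dist)


# Sample: the first three rows of the question plus a made-up second trip.
sample = pd.DataFrame({
    "longitude": [8.211610, 8.211462, 8.211367, 8.300000, 8.300100],
    "latitude": [53.189479, 53.189479, 53.189476, 53.200000, 53.200000],
    "tripId": [1790767, 1790767, 1790767, 42, 42],
})
result = add_distance(sample)
print(result)
```

Because the shift is done per group, the first row of every trip gets NaN instead of a distance to the last point of the previous trip.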