Calculating distance between all locations to first location, by group

Time:01-10

I have GPS locations from several seabird tracks, each starting from colony x, so the individual tracks all have similar first locations. For each track, I would like to calculate the beeline distance between each GPS location and either (a) a specified location that represents the location of colony x, or (b) the first GPS point of the track, which also represents the location of colony x. For (b), I would like to use the first location of each new track ID (track_id).

I have looked for appropriate functions in geosphere, sp, raster, adehabitatLT, move, ... and just cannot seem to find what I am looking for.

I can calculate the distance between successive GPS points, but that is not what I need.

library(dplyr)
library(geosphere)  # provides distVincentyEllipsoid() and distHaversine()

df %>%
  group_by(ID) %>%
  # previous point within each track
  mutate(lat_prev = lag(Lat, 1), lon_prev = lag(Lon, 1)) %>%
  mutate(dist = distVincentyEllipsoid(matrix(c(lon_prev, lat_prev), ncol = 2), # or use distHaversine
                                      matrix(c(Lon, Lat), ncol = 2)))

#example data:

df <- data.frame(Lon = c(-96.8, -96.60861, -96.86875, -96.14351, -92.82518, -90.86053, -90.14208, -84.64081, -83.7, -82, -80, -88.52732, -94.46049,-94.30, -88.60, -80.50, -81.70, -83.90, -84.60, -90.10, -90.80, -92.70, -96.10, -96.55, -96.50, -96.00),
                   Lat = c(25.38657, 25.90644, 26.57339, 27.63348, 29.03572, 28.16380, 28.21235, 26.71302, 25.12554, 24.50031, 24.89052, 30.16034, 29.34550,
                           29.34550, 30.16034, 24.89052, 24.50031, 25.12554, 26.71302, 28.21235, 28.16380, 29.03572, 27.63348, 26.57339, 25.80000, 25.30000) ,
                   ID = c(rep("ID1", 13), rep("ID2", 13)))

Grateful for any pointers.

CodePudding user response:

You were pretty close. The key is that you want to calculate the distance from the first observation in each track. To do that, first add a column with the position of each row within its track (easy to do with dplyr::row_number()). Then, in the distance calculation, always use the first observation as the reference point by subsetting with order == 1.

library(tidyverse)
library(geosphere)

df <- data.frame(Lon = c(-96.8, -96.60861, -96.86875, -96.14351, -92.82518, -90.86053, -90.14208, -84.64081, -83.7, -82, -80, -88.52732, -94.46049,-94.30, -88.60, -80.50, -81.70, -83.90, -84.60, -90.10, -90.80, -92.70, -96.10, -96.55, -96.50, -96.00),
                 Lat = c(25.38657, 25.90644, 26.57339, 27.63348, 29.03572, 28.16380, 28.21235, 26.71302, 25.12554, 24.50031, 24.89052, 30.16034, 29.34550, 29.34550, 30.16034, 24.89052, 24.50031, 25.12554, 26.71302, 28.21235, 28.16380, 29.03572, 27.63348, 26.57339, 25.80000, 25.30000),
                 ID = c(rep("ID1", 13), rep("ID2", 13)))
                 
                 
df %>%  
  group_by(ID) %>%
  mutate(order = row_number()) %>% 
  mutate(dist = distVincentyEllipsoid(matrix(c(Lon[order == 1], Lat[order == 1]), ncol = 2), 
                                      matrix(c(Lon, Lat), ncol = 2)))
#> # A tibble: 26 x 5
#> # Groups:   ID [2]
#>      Lon   Lat ID    order     dist
#>    <dbl> <dbl> <chr> <int>    <dbl>
#>  1 -96.8  25.4 ID1       1       0 
#>  2 -96.6  25.9 ID1       2   60714.
#>  3 -96.9  26.6 ID1       3  131665.
#>  4 -96.1  27.6 ID1       4  257404.
#>  5 -92.8  29.0 ID1       5  564320.
#>  6 -90.9  28.2 ID1       6  665898.
#>  7 -90.1  28.2 ID1       7  732131.
#>  8 -84.6  26.7 ID1       8 1225193.
#>  9 -83.7  25.1 ID1       9 1319482.
#> 10 -82    24.5 ID1      10 1497199.
#> # ... with 16 more rows

Created on 2022-01-09 by the reprex package (v2.0.1)
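
For intuition about what a beeline (great-circle) distance is, here is a minimal base-R sketch of the haversine formula on a sphere. The helper name haversine_m is hypothetical and not part of geosphere, and the radius of 6378137 m is assumed to match the default used by distHaversine. Because distVincentyEllipsoid works on an ellipsoid rather than a sphere, its results will differ slightly from this sketch.

```r
# Hypothetical helper (not part of geosphere): great-circle distance in metres
# on a sphere of radius r, from one reference point to vectors of points.
haversine_m <- function(lon0, lat0, lon, lat, r = 6378137) {
  to_rad <- pi / 180
  dlat <- (lat - lat0) * to_rad
  dlon <- (lon - lon0) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat0 * to_rad) * cos(lat * to_rad) * sin(dlon / 2)^2
  2 * r * asin(pmin(1, sqrt(a)))
}

# First three points of track ID1 from the example data
lon <- c(-96.8, -96.60861, -96.86875)
lat <- c(25.38657, 25.90644, 26.57339)

# Distance of every point to the first point of the track
d <- haversine_m(lon[1], lat[1], lon, lat)
```

The first element of d is zero because the reference point is compared with itself, which is a quick way to check that the right reference row was picked.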

CodePudding user response:

This also seems to work (sent to me by a friend). It is very similar to Dan's suggestion above, but it uses distHaversine and takes the first point of each track directly with Lon[1] and Lat[1]:

library(geosphere)
library(dplyr)

df %>% 
  group_by(ID) %>%
  mutate(Dist_to_col = distHaversine(c(Lon[1], Lat[1]),cbind(Lon,Lat)))
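
For option (a) in the question, the same call should also work with a fixed reference point in place of c(Lon[1], Lat[1]). The sketch below assumes a colony position stored in a vector named colony; both the name and the coordinates are placeholders (here simply the first point of ID1) and should be replaced with the real colony location. The data frame is a short excerpt of the example data so that the block runs on its own.

```r
library(dplyr)
library(geosphere)

# Small excerpt of the example data (first three points of each track)
df <- data.frame(
  Lon = c(-96.8, -96.60861, -96.86875, -94.30, -88.60, -80.50),
  Lat = c(25.38657, 25.90644, 26.57339, 29.34550, 30.16034, 24.89052),
  ID  = c(rep("ID1", 3), rep("ID2", 3))
)

# Hypothetical colony position as c(lon, lat); replace with the real coordinates
colony <- c(-96.8, 25.38657)

# The reference point is the same for every track, so no group_by() is needed
res <- df %>%
  mutate(Dist_to_col = distHaversine(colony, cbind(Lon, Lat)))
```

Since the reference no longer depends on the track, the grouping step can be dropped; keeping group_by(ID) would not change the distances.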