I would like to know the difference in position of the same individuals between different groups. In other words, the difference in position between "B150" (df$ID) in group "1" and "B150" in group "2" (df$date), repeated for all individuals ("B145", "B140", ...). The distance between different individuals (e.g. "B150" and "B145") does not interest me.
Here is a sample of the dataset:
df <- structure(list(ID = c("B150", "B145", "B140", "B136", "B150",
"B145", "B140", "B136"), Ellipsoid_height_m = c(155.5, 155.5,
155.4, 155.3, 155.5, 155.5, 155.4, 155.3), X_Lambert_72_m = c(232762.455,
232763.271, 232764.444, 232765.093, 232764.955, 232765.771, 232766.944,
232767.593), Y_Lambert_72_m = c(125994.937, 125996.489, 125997.991,
125998.854, 125994.937, 125996.489, 125997.991, 125998.854),
Z_DNG_plus130cm = c(111.102, 111.102, 111.002, 110.902, 111.102,
111.102, 111.002, 110.902), Z_DNG = c(109.802, 109.802, 109.702,
109.602, 109.802, 109.802, 109.702, 109.602), Validite_Z = c("Non",
"Non", "Non", "Non", "Non", "Non", "Non", "Non"), Type = c("Pittag",
"Pittag", "Pittag", "Pittag", "Pittag", "Pittag", "Pittag",
"Pittag"), date = structure(c(1L, 1L, 1L, 1L, 2L, 2L, 2L,
2L), .Label = c("1", "2"), class = "factor")), class = c("tbl_df",
"tbl", "data.frame"), row.names = c(NA, -8L))
CodePudding user response:
Here is a way.
To keep things simple, keep only the variables relevant to the distance, reshape to wide format so that each individual sits on a single row, and then compute the distances.
suppressPackageStartupMessages({
library(dplyr)
library(tidyr)
})
# Planar Euclidean distance between (x1, y1) and (x2, y2)
euclid <- function(x1, y1, x2, y2) {
  sqrt( (x1-x2)^2 + (y1-y2)^2 )
}
df %>%
  select(ID, contains("Lambert"), date) %>%
  pivot_wider(
    id_cols = ID,
    names_from = date,
    values_from = contains("Lambert")
  ) %>%
  mutate(Dist = euclid(X_Lambert_72_m_1, Y_Lambert_72_m_1,
                       X_Lambert_72_m_2, Y_Lambert_72_m_2)) %>%
  select(ID, Dist)
#> # A tibble: 4 × 2
#> ID Dist
#> <chr> <dbl>
#> 1 B150 2.5
#> 2 B145 2.5
#> 3 B140 2.5
#> 4 B136 2.5
Created on 2022-10-01 with reprex v2.0.2
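For comparison, the same "one row per individual" idea can be sketched in base R with merge(). This is only an illustrative alternative, not part of the answer above: the data frame pts is a trimmed copy of the question's data (ID, planar coordinates and date only), and the names pts, d1, d2, both and result are made up for the example.

```r
# Trimmed copy of the question's data: only ID, planar coordinates and date
pts <- data.frame(
  ID = rep(c("B150", "B145", "B140", "B136"), times = 2),
  X_Lambert_72_m = c(232762.455, 232763.271, 232764.444, 232765.093,
                     232764.955, 232765.771, 232766.944, 232767.593),
  Y_Lambert_72_m = c(125994.937, 125996.489, 125997.991, 125998.854,
                     125994.937, 125996.489, 125997.991, 125998.854),
  date = factor(rep(c("1", "2"), each = 4))
)

# Split by date, then join on ID so each individual sits on one row
coord_cols <- c("ID", "X_Lambert_72_m", "Y_Lambert_72_m")
d1 <- pts[pts$date == "1", coord_cols]
d2 <- pts[pts$date == "2", coord_cols]
both <- merge(d1, d2, by = "ID", suffixes = c("_1", "_2"))

# Planar Euclidean distance between the two dates for each ID
both$Dist <- sqrt((both$X_Lambert_72_m_1 - both$X_Lambert_72_m_2)^2 +
                  (both$Y_Lambert_72_m_1 - both$Y_Lambert_72_m_2)^2)
result <- both[, c("ID", "Dist")]
print(result)
```

Note that merge() sorts the result by ID, so the row order differs from the tidyr version, and an ID present on only one date is dropped by the default inner join.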
CodePudding user response:
A combination of pivot_longer() and pivot_wider() from the tidyr package will get the 2 dates onto the same row for each ID and variable. Then use summarise() from dplyr to subtract the respective coordinates and take the square root of the sum of the squared differences. If you compute several measures this way, a further pivot_wider() can put the data back in a layout similar to its original shape. (Leave out this last pivot if you're happy with the data in long format.)
I'm assuming that X_Lambert_72_m, Y_Lambert_72_m and Z_DNG are your x, y and z coordinates from which to calculate the distance. If not, then change the variables in the select() line.
library(tidyr)
library(dplyr)
want <- df %>%
  select(ID, date, X_Lambert_72_m, Y_Lambert_72_m, Z_DNG) %>% # keep ID, date and coordinates
  pivot_longer(cols=3:5,names_to='measure') %>% # pivot to longer format with one column of measure values
  pivot_wider(names_from=date) %>% # pivot wider to get one column for each date
  group_by(ID) %>% # group to get one result per ID
  summarise(distance=sqrt(sum((`2`-`1`)^2))) # calculate distance
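In the sample data, Z_DNG is identical on both dates, so this 3D distance should match the planar distance from the first answer. A quick check for B150, with the coordinates copied from the question (the names p1, p2, dist_3d and dist_2d are only for illustration):

```r
# B150 coordinates copied from the sample data (date 1, then date 2)
p1 <- c(x = 232762.455, y = 125994.937, z = 109.802)
p2 <- c(x = 232764.955, y = 125994.937, z = 109.802)

# Distance using x, y and z
dist_3d <- sqrt(sum((p2 - p1)^2))

# Distance using the planar coordinates only
dist_2d <- sqrt(sum((p2[c("x", "y")] - p1[c("x", "y")])^2))

print(c(dist_3d = dist_3d, dist_2d = dist_2d))
```

If the heights did change between dates, the two values would differ, so pick the set of coordinates that matches the distance you actually want.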