Min, Max and Mean values from Distance Matrix in R

I have x,y coordinates of cells grouped by a patient ID as such:

PatientID	cX	cY
1	5348	4902
1	6360	4887
1	5398	4874
2	5348	4902
2	6360	4887
2	5398	4874

Each row holds the x,y coordinates of an individual cell and its associated patient ID.

I want to create a distance matrix for each patient ID, calculate the minimum, maximum, and mean distance from each cell to the other cells of the same patient, and add those values as columns to the original dataframe.

CodePudding user response:

I made your data into a usable form with:

patient_data <- data.frame(
  PatientId = c(1, 1, 1, 2, 2, 2),
  cX = c(5348, 6360, 5398, 5348, 6360, 5398),
  cY = c(4902, 4887, 4874, 4902, 4887, 4874)
)

  PatientId   cX   cY
1         1 5348 4902
2         1 6360 4887
3         1 5398 4874
4         2 5348 4902
5         2 6360 4887
6         2 5398 4874

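The functions you are looking for are dplyr::group_by and dplyr::group_modify: group_modify runs a function on each patient's rows and binds the per-patient results back into one data frame. You can use dplyr::group_map, which returns a list with one element per group, to check the output of the individual steps.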

library(magrittr) # provides the %>% pipe

patient_data %>%
  dplyr::group_by(PatientId) %>%
  dplyr::group_modify(~ {
    # .x holds one patient's rows without the grouping column, i.e. only cX and cY
    distance_matrix <- .x %>% dist() %>% as.matrix() # pairwise Euclidean distances
    diag(distance_matrix) <- NA # ignore each cell's zero distance to itself
    data.frame( # get min/max/avg for each row of the distance matrix
      cell_id = seq_len(nrow(distance_matrix)),
      min_dist = apply(distance_matrix, MARGIN = 1, FUN = min, na.rm = TRUE),
      max_dist = apply(distance_matrix, MARGIN = 1, FUN = max, na.rm = TRUE),
      avg_dist = apply(distance_matrix, MARGIN = 1, FUN = mean, na.rm = TRUE)
    )
  })

# A tibble: 6 × 5
# Groups:   PatientId [2]
  PatientId cell_id min_dist max_dist avg_dist
      <dbl>   <int>    <dbl>    <dbl>    <dbl>
1         1       1     57.3    1012.     535.
2         1       2    962.     1012.     987.
3         1       3     57.3     962.     510.
4         2       1     57.3    1012.     535.
5         2       2    962.     1012.     987.
6         2       3     57.3     962.     510.
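
The question asks for the statistics as extra columns on the original data frame, whereas the result above replaces cX and cY with a cell_id. A minimal base R sketch that keeps the original columns could look like the following. The helper names add_distance_stats and stat_or_na are hypothetical (not from any package), and the sketch assumes the columns are called PatientId, cX and cY as in the data frame above.

```r
patient_data <- data.frame(
  PatientId = c(1, 1, 1, 2, 2, 2),
  cX = c(5348, 6360, 5398, 5348, 6360, 5398),
  cY = c(4902, 4887, 4874, 4902, 4887, 4874)
)

# Hypothetical helper: apply f to the non-NA values of x,
# returning NA when there are none (e.g. a patient with a single cell).
stat_or_na <- function(x, f) {
  x <- x[!is.na(x)]
  if (length(x) == 0) NA_real_ else f(x)
}

# Hypothetical helper: append min/max/mean distance columns per patient.
# Assumes id_col and coord_cols name existing columns of df.
add_distance_stats <- function(df, id_col = "PatientId",
                               coord_cols = c("cX", "cY")) {
  groups <- split(df, df[[id_col]]) # one data frame per patient
  with_stats <- lapply(groups, function(g) {
    d <- as.matrix(dist(g[, coord_cols, drop = FALSE]))
    diag(d) <- NA # ignore each cell's zero distance to itself
    g$min_dist <- unname(apply(d, 1, stat_or_na, f = min))
    g$max_dist <- unname(apply(d, 1, stat_or_na, f = max))
    g$avg_dist <- unname(apply(d, 1, stat_or_na, f = mean))
    g
  })
  out <- do.call(rbind, with_stats) # rows come back ordered by patient ID
  rownames(out) <- NULL
  out
}

result <- add_distance_stats(patient_data)
print(result)
```

Note that split() orders the groups by patient ID, so if the input is not already sorted by PatientId the rows come back grouped rather than in their original order.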