Iterative function to subtract all possible permutations in R

Time:11-23

I have a data frame...

example <- data.frame(obs_val= c(20,15,3,7,5), patient = c("pt1","pt2","pt3","pt4","pt5"))

... where every row or "patient" is a unique observation.

My goal is to generate a data frame that subtracts each patient's observed value (obs_val) from every other patient's obs_val. The subtractions should cover the pairings (permutations) of distinct patients, i.e. pt1 should never have its own obs_val subtracted from itself. Ideally, the final data frame should look something like the following:

             pt1-pt2    pt1-pt3    pt1-pt4    pt1-pt5    pt2-pt3    pt2-pt4    ...

obs_val_diff    5          17         13         15         12         8       ...

Any suggestions on solving this problem, or reformatting the final data frame, are greatly appreciated. Thank you!

CodePudding user response:

You can join the data frame to itself, remove the rows where a patient has been matched to itself, and keep the differences:

library(data.table)
library(magrittr)

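setDT(example)  # converts example to a data.table by reference
# Self-join on a constant id so every row is paired with every row
example[,id:=1][example, on=.(id), allow.cartesian=TRUE] %>% 
  .[patient!=i.patient] %>%   # drop rows where a patient is paired with itself
  .[, .(p1 = i.patient, p2=patient, p1_minus_p2=i.obs_val-obs_val)]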

Output:

     p1  p2 p1_minus_p2
 1: pt1 pt2           5
 2: pt1 pt3          17
 3: pt1 pt4          13
 4: pt1 pt5          15
 5: pt2 pt1          -5
 6: pt2 pt3          12
 7: pt2 pt4           8
 8: pt2 pt5          10
 9: pt3 pt1         -17
10: pt3 pt2         -12
11: pt3 pt4          -4
12: pt3 pt5          -2
13: pt4 pt1         -13
14: pt4 pt2          -8
15: pt4 pt3           4
16: pt4 pt5           2
17: pt5 pt1         -15
18: pt5 pt2         -10
19: pt5 pt3           2
20: pt5 pt4          -2
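
The join above returns both orderings of every pair (20 rows), while the question's desired output lists each pair once. As a rough base R sketch of the same idea (swapping the data.table join for outer(), which is not what the answer uses), the full difference matrix can be built and then reduced to one entry per pair. The names d, keep, pairs and wide are placeholders chosen for this illustration:

```r
example <- data.frame(obs_val = c(20, 15, 3, 7, 5),
                      patient = c("pt1", "pt2", "pt3", "pt4", "pt5"),
                      stringsAsFactors = FALSE)

# d[i, j] = obs_val of patient i minus obs_val of patient j
d <- outer(example$obs_val, example$obs_val, `-`)

# Positions above the diagonal: each unordered pair appears exactly once
keep <- which(upper.tri(d), arr.ind = TRUE)
# Reorder so the pairs read pt1-pt2, pt1-pt3, ..., pt4-pt5
keep <- keep[order(keep[, "row"], keep[, "col"]), , drop = FALSE]

# Long format: one row per pair
pairs <- data.frame(p1 = example$patient[keep[, "row"]],
                    p2 = example$patient[keep[, "col"]],
                    p1_minus_p2 = d[keep],
                    stringsAsFactors = FALSE)

# Wide format: one column per pair, as sketched in the question
wide <- setNames(as.data.frame(t(pairs$p1_minus_p2)),
                 paste(pairs$p1, pairs$p2, sep = "-"))
```

Dropping the upper.tri() filter and instead excluding only the diagonal would give the 20 ordered pairs shown in the output above.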

CodePudding user response:

Another option is to use combn to get all the combinations and then map out the subtractions.

library(tidyverse)

data.frame(t(combn(example$patient, 2))) |>
  mutate(obs_val_diff = map2_dbl(X1, X2, ~example[example$patient ==.x, "obs_val"] -
                                   example[example$patient ==.y, "obs_val"])) |>
  unite(test, X1, X2, sep = "-") |>
  pivot_wider(names_from = test, values_from = obs_val_diff)
#> # A tibble: 1 x 10
#>   `pt1-pt2` `pt1-pt3` `pt1-pt4` pt1-pt~1 pt2-p~2 pt2-p~3 pt2-p~4 pt3-p~5 pt3-p~6
#>       <dbl>     <dbl>     <dbl>    <dbl>   <dbl>   <dbl>   <dbl>   <dbl>   <dbl>
#> 1         5        17        13       15      12       8      10      -4      -2
#> # ... with 1 more variable: `pt4-pt5` <dbl>, and abbreviated variable names
#> #   1: `pt1-pt5`, 2: `pt2-pt3`, 3: `pt2-pt4`, 4: `pt2-pt5`, 5: `pt3-pt4`,
#> #   6: `pt3-pt5`

Or in base R:

apply(t(combn(example$patient, 2)), 1, 
      \(x) -diff(example[example$patient %in% x, "obs_val"])) |>
  (\(v) matrix(v, ncol = length(v)))() |>
  as.data.frame() |>
  `colnames<-`(apply(t(combn(example$patient, 2)), 1, 
                     \(x) paste(x, collapse = "-")))
#>   pt1-pt2 pt1-pt3 pt1-pt4 pt1-pt5 pt2-pt3 pt2-pt4 pt2-pt5 pt3-pt4 pt3-pt5
#> 1       5      17      13      15      12       8      10      -4      -2
#>   pt4-pt5
#> 1       2
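
A possibly shorter variant of the combn approach, offered only as a sketch: combn() accepts a FUN argument that is applied to each combination, so the differences and the column labels can each be computed in one call without looking rows up by patient name. The names diffs, labels and result are placeholders for this illustration:

```r
example <- data.frame(obs_val = c(20, 15, 3, 7, 5),
                      patient = c("pt1", "pt2", "pt3", "pt4", "pt5"),
                      stringsAsFactors = FALSE)

# For each pair of values, subtract the second from the first
diffs <- combn(example$obs_val, 2, FUN = function(v) v[1] - v[2])

# Build the matching "ptA-ptB" label for each pair, in the same order
labels <- combn(example$patient, 2, FUN = paste, collapse = "-")

# One-row data frame with one named column per pair
result <- setNames(as.data.frame(t(diffs)), labels)
```

Because both calls enumerate the pairs in the same order, the labels line up with the differences. This assumes obs_val and patient are aligned row by row, as they are in the example data.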