Calculating score changes across two or more time points

I have a very simple question. Imagine a cognitive test with 6 items, each scored separately on a 0-10 scale. Now let's say 5 patients have taken this test at three (or more) different times of the day.

library(tibble)

df <- tibble(
  id = rep(1:5, each = 3),
  time = rep(c("morning", "evening", "night"), each = 1, times = 5),
  i1 = round(runif(15, 0, 10)),
  i2 = round(runif(15, 0, 10)),
  i3 = round(runif(15, 0, 10)),
  i4 = round(runif(15, 0, 10)),
  i5 = round(runif(15, 0, 10)),
  i6 = round(runif(15, 0, 10))
)

Now, how can I calculate the changes in the score of each individual item for every patient between different times of the day? tidyverse approaches are highly appreciated.

P.S.: the output should preferably look something like this:

df <- tibble(
  id = rep(1:5, each = 3),
  time = rep(c("morning", "evening", "night"), each = 1, times = 5),
  i1 = round(runif(15, 0, 10)),
  i2 = round(runif(15, 0, 10)),
  i3 = round(runif(15, 0, 10)),
  i4 = round(runif(15, 0, 10)),
  i5 = round(runif(15, 0, 10)),
  i6 = round(runif(15, 0, 10)),
  diff_morning_evening = rep(NA, 15),
  diff_morning_night = rep(NA, 15),
  diff_evening_night = rep(NA, 15),
)

CodePudding user response：

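You can use combn() to generate all pairwise combinations of the time points and, by setting FUN = diff, compute the change for each pair. Note that diff() returns the later value minus the earlier one, so diff_morning_evening is the evening score minus the morning score.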

library(tidyverse)

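df %>% 
  group_by(id) %>% 
  # one row per pair of times within each patient; dplyr >= 1.1.0 warns about
  # multi-row summarise() results and offers reframe() for this purpose
  summarise(col = combn(time, 2, paste, collapse = '_'),
            across(i1:i6, ~ combn(.x, 2, diff)), .groups = 'drop') %>%
  # spread the pairs into columns named <item>_diff_<time1>_<time2>
  pivot_wider(names_from = col, values_from = i1:i6, names_prefix = 'diff_')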

# # A tibble: 5 × 19
#      id i1_diff_morning_evening i1_diff_morning_night i1_diff_evening_night i2_diff_morning_evening i2_diff_morning_night i2_diff_evening_night
#   <int>                   <dbl>                 <dbl>                 <dbl>                   <dbl>                 <dbl>                 <dbl>
# 1     1                      -8                    -1                     7                      -1                     1                     2
# 2     2                       6                     0                    -6                      -2                     6                     8
# 3     3                      -9                    -7                     2                      -1                     3                     4
# 4     4                       3                     0                    -3                      -9                     0                     9
# 5     5                       0                    -3                    -3                      -1                     7                     8
#
# … with 12 more variables:
# i3_diff_morning_evening <dbl>, i3_diff_morning_night <dbl>, i3_diff_evening_night <dbl>,
# i4_diff_morning_evening <dbl>, i4_diff_morning_night <dbl>, i4_diff_evening_night <dbl>,
# i5_diff_morning_evening <dbl>, i5_diff_morning_night <dbl>, i5_diff_evening_night <dbl>,
# i6_diff_morning_evening <dbl>, i6_diff_morning_night <dbl>, i6_diff_evening_night <dbl>
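
To see what the two combn() calls above do on their own, here is a minimal base R sketch for a single patient and a single item. The scores 1, 5 and 8 are hypothetical values chosen only for illustration; they are not taken from the output above.

```r
# Hypothetical scores for one patient on one item (assumed values)
times  <- c("morning", "evening", "night")
scores <- c(1, 5, 8)

# All pairs of time labels, pasted together with "_"
pair_names <- combn(times, 2, paste, collapse = "_")

# For each pair of scores, diff() gives second minus first
pair_diffs <- combn(scores, 2, diff)

print(pair_names)  # "morning_evening" "morning_night" "evening_night"
print(pair_diffs)  # 4 7 3
```

Because both calls walk through the pairs in the same order, the labels and the differences line up element by element, which is what lets pivot_wider() turn them into matching column names and values.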

CodePudding user response：

An approach calculating the changes in a compact way:

1st row: change from morning to night (night - morning)
2nd row: change from morning to evening (evening - morning)
3rd row: change from evening to night (night - evening)

df %>% 
  group_by(id) %>% 
  mutate(across(i1:i6, ~ ifelse(is.na(.x - lag(.x)), 
                           lead(.x, 2) - .x, .x - lag(.x)), 
           .names="{.col}_chg")) %>% 
  ungroup()
# A tibble: 15 × 14
      id time       i1    i2    i3    i4    i5    i6 i1_chg i2_chg i3_chg i4_chg
   <int> <chr>   <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>  <dbl>  <dbl>  <dbl>  <dbl>
 1     1 morning     1     2     4     9     5     6      7     -1      1     -8
 2     1 evening     5     0     4     8     9     4      4     -2      0     -1
 3     1 night       8     1     5     1     4     5      3      1      1     -7
 4     2 morning     7     7     4     3     2     9     -5     -1      4      5
 5     2 evening     8     9     1     2     4     5      1      2     -3     -1
 6     2 night       2     6     8     8    10     2     -6     -3      7      6
 7     3 morning     9     6     6     1     5     5     -8     -1      2      0
 8     3 evening     3     2     8     1     3    10     -6     -4      2      0
 9     3 night       1     5     8     1     3     3     -2      3      0      0
10     4 morning     7     2     9     1     5     8      1      1     -6      0
11     4 evening     3     5     9     5     6     3     -4      3      0      4
12     4 night       8     3     3     1     3     2      5     -2     -6     -4
13     5 morning     4     1     3     7     1     0      4      6      4      2
14     5 evening     7     2     7     7     5     2      3      1      4      0
15     5 night       8     7     7     9     8     4      1      5      0      2
# … with 2 more variables: i5_chg <dbl>, i6_chg <dbl>
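
The lag()/lead() expression above can be traced on a single vector. This is a small sketch assuming exactly three time points per patient, ordered morning, evening, night; the scores are the i1 values of patient 1 from the output above.

```r
library(dplyr)

# i1 scores of patient 1: morning, evening, night
x <- c(1, 5, 8)

step_back  <- x - dplyr::lag(x)       # NA, evening - morning, night - evening
first_last <- dplyr::lead(x, 2) - x   # night - morning, NA, NA

# Where there is no previous value (the first row), fall back to night - morning
chg <- ifelse(is.na(step_back), first_last, step_back)

print(chg)  # 7 4 3
```

With more than three time points, or with rows in a different order, the first row would no longer be the first-to-last change, so the data would need to be sorted and the offset in lead() adjusted.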

An approach that sums specific rows (here, morning and night):

df %>% 
  group_by(id) %>% 
  summarize(across(i1:i6, ~ 
    .x[which(time == "morning")] + .x[which(time == "night")]))
# A tibble: 5 × 7
     id    i1    i2    i3    i4    i5    i6
  <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1     1     9     3     9    10     9    11
2     2     9    13    12    11    12    11
3     3    10    11    14     2     8     8
4     4    15     5    12     2     8    10
5     5    12     8    10    16     9     4
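
The same row-selection idea can give a difference instead of a sum, which is closer to what the question asks for. The following is only a sketch: the data frame toy, its fixed scores, and the column suffix _diff_morning_night are hypothetical choices made so the result is easy to check by hand.

```r
library(dplyr)

# Hypothetical toy data: 2 patients, 2 items, fixed scores (assumed values)
toy <- tibble(
  id   = rep(1:2, each = 3),
  time = rep(c("morning", "evening", "night"), times = 2),
  i1   = c(1, 5, 8, 7, 8, 2),
  i2   = c(2, 0, 1, 7, 9, 6)
)

# For each patient and item, pick the night and morning rows by label
# and subtract them; assumes each label occurs exactly once per patient
night_vs_morning <- toy %>%
  group_by(id) %>%
  summarise(
    across(i1:i2,
           ~ .x[time == "night"] - .x[time == "morning"],
           .names = "{.col}_diff_morning_night"),
    .groups = "drop"
  )

print(night_vs_morning)
```

Selecting rows by label rather than by position means the result does not depend on the row order within each patient, but it would break if a time label were missing or duplicated for some patient.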