I have a very simple question. Imagine a cognitive test with 6 items, each of which is scored separately on a 0-10 scale. Now let's say 5 patients have taken this test at three (or more) different times of the day.
library(tidyverse)

df <- tibble(
  id = rep(1:5, each = 3),
  time = rep(c("morning", "evening", "night"), each = 1, times = 5),
  i1 = round(runif(15, 0, 10)),
  i2 = round(runif(15, 0, 10)),
  i3 = round(runif(15, 0, 10)),
  i4 = round(runif(15, 0, 10)),
  i5 = round(runif(15, 0, 10)),
  i6 = round(runif(15, 0, 10))
)
Now, how can I calculate the changes in the score of each individual item for every patient between different times of the day?
tidyverse approaches are highly appreciated.
P.S.: the output should preferably look something like this:
df <- tibble(
  id = rep(1:5, each = 3),
  time = rep(c("morning", "evening", "night"), each = 1, times = 5),
  i1 = round(runif(15, 0, 10)),
  i2 = round(runif(15, 0, 10)),
  i3 = round(runif(15, 0, 10)),
  i4 = round(runif(15, 0, 10)),
  i5 = round(runif(15, 0, 10)),
  i6 = round(runif(15, 0, 10)),
  diff_morning_evening = rep(NA, 15),
  diff_morning_night = rep(NA, 15),
  diff_evening_night = rep(NA, 15),
)
CodePudding user response:
You can use combn to generate all combinations of 2 times, and apply diff to each combination by setting FUN = diff. Each change is the later value minus the earlier one, so diff_morning_evening is evening - morning.
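library(tidyverse)

df %>%
  group_by(id) %>%
  # one row per pair of times within each patient
  # (dplyr >= 1.1.0 warns here; reframe() is the replacement for multi-row summaries)
  summarise(col = combn(time, 2, paste, collapse = '_'),
            across(i1:i6, ~ combn(.x, 2, diff)), .groups = 'drop') %>%
  # spread the pairs into columns named <item>_diff_<time1>_<time2>
  pivot_wider(names_from = col, values_from = i1:i6, names_prefix = 'diff_')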
# # A tibble: 5 × 19
# id i1_diff_morning_evening i1_diff_morning_night i1_diff_evening_night i2_diff_morning_evening i2_diff_morning_night i2_diff_evening_night
# <int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
# 1 1 -8 -1 7 -1 1 2
# 2 2 6 0 -6 -2 6 8
# 3 3 -9 -7 2 -1 3 4
# 4 4 3 0 -3 -9 0 9
# 5 5 0 -3 -3 -1 7 8
#
# … with 12 more variables:
# i3_diff_morning_evening <dbl>, i3_diff_morning_night <dbl>, i3_diff_evening_night <dbl>,
# i4_diff_morning_evening <dbl>, i4_diff_morning_night <dbl>, i4_diff_evening_night <dbl>,
# i5_diff_morning_evening <dbl>, i5_diff_morning_night <dbl>, i5_diff_evening_night <dbl>,
# i6_diff_morning_evening <dbl>, i6_diff_morning_night <dbl>, i6_diff_evening_night <dbl>
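To see what combn contributes to the pipeline above, it can help to run it on a single patient's values outside of dplyr. This is a minimal base-R sketch; the vectors times and scores are made-up illustration values, not taken from the data above.

```r
# Hypothetical values for one patient and one item (illustration only)
times  <- c("morning", "evening", "night")
scores <- c(1, 5, 8)

# All pairs of time labels, pasted together into column-name suffixes
pair_names <- combn(times, 2, paste, collapse = "_")

# diff() on each pair of scores gives (second element - first element)
pair_diffs <- combn(scores, 2, diff)

setNames(pair_diffs, pair_names)
# morning_evening   morning_night   evening_night
#               4               7               3
```

Because combn enumerates the pairs in the same order for the labels and for the scores, the names and the differences line up row by row inside summarise.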
CodePudding user response:
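An approach that calculates the changes compactly, storing one change per row within each patient. It relies on each patient having exactly three rows in the order morning, evening, night:
- 1st row: night - morning
- 2nd row: evening - morning
- 3rd row: night - evening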
df %>%
  group_by(id) %>%
  mutate(across(i1:i6, ~ ifelse(is.na(.x - lag(.x)),
                                lead(.x, 2) - .x, .x - lag(.x)),
                .names = "{.col}_chg")) %>%
  ungroup()
# A tibble: 15 × 14
id time i1 i2 i3 i4 i5 i6 i1_chg i2_chg i3_chg i4_chg
<int> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 1 morning 1 2 4 9 5 6 7 -1 1 -8
2 1 evening 5 0 4 8 9 4 4 -2 0 -1
3 1 night 8 1 5 1 4 5 3 1 1 -7
4 2 morning 7 7 4 3 2 9 -5 -1 4 5
5 2 evening 8 9 1 2 4 5 1 2 -3 -1
6 2 night 2 6 8 8 10 2 -6 -3 7 6
7 3 morning 9 6 6 1 5 5 -8 -1 2 0
8 3 evening 3 2 8 1 3 10 -6 -4 2 0
9 3 night 1 5 8 1 3 3 -2 3 0 0
10 4 morning 7 2 9 1 5 8 1 1 -6 0
11 4 evening 3 5 9 5 6 3 -4 3 0 4
12 4 night 8 3 3 1 3 2 5 -2 -6 -4
13 5 morning 4 1 3 7 1 0 4 6 4 2
14 5 evening 7 2 7 7 5 2 3 1 4 0
15 5 night 8 7 7 9 8 4 1 5 0 2
# … with 2 more variables: i5_chg <dbl>, i6_chg <dbl>
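Which change ends up in which row can be checked on a single patient's column. A small sketch, assuming dplyr is installed and that the three values are ordered morning, evening, night; the vector x is made up for illustration.

```r
library(dplyr)

# Hypothetical scores for one patient: morning, evening, night
x <- c(1, 5, 8)

step_diff <- x - lag(x)       # NA, evening - morning, night - evening
span_diff <- lead(x, 2) - x   # night - morning, NA, NA

# Same rule as in the mutate() call: use span_diff where step_diff is NA
chg <- ifelse(is.na(step_diff), span_diff, step_diff)
chg
# [1] 7 4 3
```

With more than three time points per patient, the first row would still hold (third value - first value), so this pattern would need changes to cover every pair.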
An approach that sums specific rows (here morning and night) for each patient:
df %>%
  group_by(id) %>%
  summarize(across(i1:i6, ~
    .x[which(time == "morning")] + .x[which(time == "night")]))
# A tibble: 5 × 7
id i1 i2 i3 i4 i5 i6
<int> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 1 9 3 9 10 9 11
2 2 9 13 12 11 12 11
3 3 10 11 14 2 8 8
4 4 15 5 12 2 8 10
5 5 12 8 10 16 9 4
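The same indexing pattern can return a change rather than a sum by subtracting the two selected values. A sketch, assuming dplyr is installed; toy is a small made-up data frame with two patients and two items, and the _night_minus_morning suffix is only an illustrative naming choice.

```r
library(dplyr)

# Made-up data: 2 patients, 3 times of day, 2 items (illustration only)
toy <- tibble(
  id   = rep(1:2, each = 3),
  time = rep(c("morning", "evening", "night"), times = 2),
  i1   = c(1, 5, 8, 7, 8, 2),
  i2   = c(2, 0, 1, 7, 9, 6)
)

night_vs_morning <- toy %>%
  group_by(id) %>%
  summarise(
    # Select values by time label, so the row order within a patient does not matter
    across(i1:i2,
           ~ .x[time == "night"] - .x[time == "morning"],
           .names = "{.col}_night_minus_morning"),
    .groups = "drop"
  )

night_vs_morning
# i1_night_minus_morning: 7, -5
# i2_night_minus_morning: -1, -1
```

Selecting by label assumes each patient has exactly one row per time of day; a missing or duplicated time would change the length of the result.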