I need a column of "lagged differences" in Rk between the "H" and "A" rows within each id. Some ids have repeated rows, so I can't simply use lag = 1. Every row of the output should be filled, and the sign should reflect the direction of the comparison.
Here is the data:
df <- structure(
  list(
    id = c(1, 1, 2, 2, 3, 3, 3, 4, 4, 4),
    H_A = c("H", "A", "H", "A", "H", "H", "A", "H", "A", "A"),
    Rk = c(6, 15, 19, 7, 8, 8, 10, 12, 3, 3)
  ),
  row.names = c(NA, -10L),
  class = c("tbl_df", "tbl", "data.frame")
)
I need this output:
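The closest I could get was:
library(dplyr)
df %>%
  group_by(id) %>%
  # every row is compared with the first 'A' row of its id,
  # so the 'A' rows themselves come out as 0 here
  mutate(Rk_diff = Rk - Rk[match('A', H_A)]) %>%
  ungroup()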
Answer:
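We could keep that first step and then replace the 0s it leaves on the 'A' rows with the negated difference from the first non-'A' row of the same id: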
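library(dplyr)
df %>%
  group_by(id) %>%
  mutate(
    # difference of every row against the first 'A' row in the group
    Rk_diff = Rk - Rk[match('A', H_A)],
    # the 'A' rows are 0 at this point; give them the opposite sign
    # of the first non-'A' difference
    Rk_diff = replace(Rk_diff, Rk_diff == 0, -1 * (Rk_diff[H_A != 'A'][1]))
  ) %>%
  ungroup()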
Output:
# A tibble: 10 × 4
      id H_A      Rk Rk_diff
   <dbl> <chr> <dbl>   <dbl>
 1     1 H         6      -9
 2     1 A        15       9
 3     2 H        19      12
 4     2 A         7     -12
 5     3 H         8      -2
 6     3 H         8      -2
 7     3 A        10       2
 8     4 H        12       9
 9     4 A         3      -9
10     4 A         3      -9
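One caveat with the replace() step: it finds the 'A' rows by looking for zeros, so in a group with several 'H' rows, an 'H' row whose Rk happens to equal the 'A' value would be overwritten as well. Below is a minimal sketch that keys on H_A directly instead. It assumes every id has at least one 'H' and one 'A' row, and that the first row of each kind is the one to compare against; neither is stated in the question. The name `res` is only for illustration.

```r
library(dplyr)

# Same data as in the question, rebuilt as a plain data.frame
df <- data.frame(
  id  = c(1, 1, 2, 2, 3, 3, 3, 4, 4, 4),
  H_A = c("H", "A", "H", "A", "H", "H", "A", "H", "A", "A"),
  Rk  = c(6, 15, 19, 7, 8, 8, 10, 12, 3, 3)
)

res <- df %>%
  group_by(id) %>%
  mutate(
    Rk_diff = if_else(
      H_A == "H",
      Rk - Rk[match("A", H_A)],  # 'H' rows: minus the first 'A' value (assumption)
      Rk - Rk[match("H", H_A)]   # other rows: minus the first 'H' value (assumption)
    )
  ) %>%
  ungroup()

res
```

On the question's data this gives the same Rk_diff column as the output above.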
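Another option is to compute the difference only for the non-'A' rows, fill its negated value down within each id, and combine the two columns with coalesce(). Since fill() fills downward by default, this relies on the 'H' rows coming before the 'A' rows within each id: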
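library(dplyr)
library(tidyr)
df %>%
  group_by(id) %>%
  mutate(
    # difference for the non-'A' rows only; 'A' rows stay NA
    Rk_diff = case_when(H_A != 'A' ~ Rk - Rk[match('A', H_A)]),
    # same value with the sign flipped, still NA on the 'A' rows
    Rk_diff2 = -1 * Rk_diff
  ) %>%
  # carry the flipped value down onto the 'A' rows within each id
  fill(Rk_diff2) %>%
  ungroup() %>%
  # take Rk_diff where available, otherwise the filled value; drop the helper
  mutate(Rk_diff = coalesce(Rk_diff, Rk_diff2), Rk_diff2 = NULL)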
Output:
# A tibble: 10 × 4
      id H_A      Rk Rk_diff
   <dbl> <chr> <dbl>   <dbl>
 1     1 H         6      -9
 2     1 A        15       9
 3     2 H        19      12
 4     2 A         7     -12
 5     3 H         8      -2
 6     3 H         8      -2
 7     3 A        10       2
 8     4 H        12       9
 9     4 A         3      -9
10     4 A         3      -9
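To see why the fill() step works, it may help to look at the two helper columns before they are coalesced. The sketch below rebuilds the question's data as a plain data.frame and stops right after fill(); the name `steps` is only for illustration.

```r
library(dplyr)
library(tidyr)

# Same data as in the question, rebuilt as a plain data.frame
df <- data.frame(
  id  = c(1, 1, 2, 2, 3, 3, 3, 4, 4, 4),
  H_A = c("H", "A", "H", "A", "H", "H", "A", "H", "A", "A"),
  Rk  = c(6, 15, 19, 7, 8, 8, 10, 12, 3, 3)
)

steps <- df %>%
  group_by(id) %>%
  mutate(
    # NA on the 'A' rows, because case_when() has no branch for them
    Rk_diff  = case_when(H_A != "A" ~ Rk - Rk[match("A", H_A)]),
    Rk_diff2 = -1 * Rk_diff
  ) %>%
  # fill() respects the grouping, so values never leak across ids
  fill(Rk_diff2) %>%
  ungroup()

steps
```

At this point Rk_diff is still NA on every 'A' row, while Rk_diff2 holds the sign-flipped difference on all rows, which is what coalesce() then picks up for the 'A' rows.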