Mutate case_when in R to create a column of time periods per participant

Time:02-11

I have tested participants at three points in time, and I have the dates on which they were tested. I want to create a column whose levels are first, second, and third. Each participant has three test dates, and the dates differ between participants. The data looks like this:

structure(list(id = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L), time_tested = c("2022-02-05", "2022-02-05", "2022-02-05", 
"2022-02-08", "2022-02-08", "2022-02-08", "2022-02-11", "2022-02-11", 
"2022-02-11", "2022-02-08", "2022-02-08", "2022-02-08", "2022-02-10", 
"2022-02-10", "2022-02-10", "2022-02-13", "2022-02-13", "2022-02-13", 
"2022-02-05", "2022-02-05", "2022-02-05", "2022-02-08", "2022-02-08", 
"2022-02-08", "2022-02-11", "2022-02-11", "2022-02-11")), class = "data.frame", row.names = c(NA, 
-27L))

and this is the result I want:

structure(list(id = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 
3L), time_tested = c("2022-02-05", "2022-02-05", "2022-02-05", 
"2022-02-08", "2022-02-08", "2022-02-08", "2022-02-11", "2022-02-11", 
"2022-02-11", "2022-02-08", "2022-02-08", "2022-02-08", "2022-02-10", 
"2022-02-10", "2022-02-10", "2022-02-13", "2022-02-13", "2022-02-13", 
"2022-02-05", "2022-02-05", "2022-02-05", "2022-02-08", "2022-02-08", 
"2022-02-08", "2022-02-11", "2022-02-11", "2022-02-11"), period = c("first", 
"first", "first", "second", "second", "second", "third", "third", 
"third", "first", "first", "first", "second", "second", "second", 
"third", "third", "third", "first", "first", "first", "second", 
"second", "second", "third", "third", "third")), class = "data.frame", row.names = c(NA, 
-27L))

Thank you!

CodePudding user response:

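You can use data.table::rleid to number the runs of identical dates within each participant, and the ordinal function from the english package to convert those numbers to ordinal words. Note that rleid numbers runs in row order, so this assumes the rows are already sorted by date within each id.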
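To make the idea of a run-length id concrete, here is a minimal base R sketch of the same computation. The helper name run_id is hypothetical and is not part of data.table; it is only meant to illustrate what such an id looks like for one participant's dates.

```r
# Sketch of a run-length id: consecutive equal values share one id.
# run_id is a hypothetical helper for illustration, not a data.table function.
run_id <- function(x) {
  runs <- rle(x)                               # lengths of consecutive runs
  rep(seq_along(runs$lengths), runs$lengths)   # repeat each run number over its run
}

dates <- c("2022-02-05", "2022-02-05", "2022-02-08", "2022-02-11", "2022-02-11")
run_id(dates)
# [1] 1 1 2 3 3
```

Because the ids follow row order rather than date order, a date that reappears later in the column would start a new run and get a new id.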

Base R

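# Number the runs of identical dates within each id; ave() keeps the character type of time_tested, hence as.numeric()
df$period <- as.numeric(ave(df$time_tested, df$id, FUN = data.table::rleid))
# Convert 1, 2, 3 to first, second, third
df$english <- english::ordinal(df$period)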

tidyverse

library(dplyr)  # provides %>%, group_by() and mutate()

df %>% 
  group_by(id) %>% 
  mutate(period = data.table::rleid(time_tested), 
         english = english::ordinal(period))

output

   id time_tested period english
1   1  2022-02-05      1   first
2   1  2022-02-05      1   first
3   1  2022-02-05      1   first
4   1  2022-02-08      2  second
5   1  2022-02-08      2  second
6   1  2022-02-08      2  second
7   1  2022-02-11      3   third
8   1  2022-02-11      3   third
9   1  2022-02-11      3   third
10  2  2022-02-08      1   first
11  2  2022-02-08      1   first
12  2  2022-02-08      1   first
13  2  2022-02-10      2  second
14  2  2022-02-10      2  second
15  2  2022-02-10      2  second
16  2  2022-02-13      3   third
17  2  2022-02-13      3   third
18  2  2022-02-13      3   third
19  3  2022-02-05      1   first
20  3  2022-02-05      1   first
21  3  2022-02-05      1   first
22  3  2022-02-08      2  second
23  3  2022-02-08      2  second
24  3  2022-02-08      2  second
25  3  2022-02-11      3   third
26  3  2022-02-11      3   third
27  3  2022-02-11      3   third
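If you would rather avoid extra packages, the following is a rough base R sketch of an alternative, not the method used above. It ranks each date among the participant's distinct dates, so it does not depend on the rows being sorted, and it produces a plain character column as in the desired result. The names labels and rank_dates are made up for illustration, and the sketch assumes each participant has at most three distinct dates (a fourth date would give NA).

```r
# Rebuild the example data from the question
df <- data.frame(
  id = rep(1:3, each = 9),
  time_tested = c(
    rep(c("2022-02-05", "2022-02-08", "2022-02-11"), each = 3),
    rep(c("2022-02-08", "2022-02-10", "2022-02-13"), each = 3),
    rep(c("2022-02-05", "2022-02-08", "2022-02-11"), each = 3)
  )
)

# Assumed label set; a participant with more than three dates would get NA
labels <- c("first", "second", "third")

# Position of each date among the participant's sorted distinct dates
rank_dates <- function(d) match(d, sort(unique(d)))

# Convert to Date, then to integer day counts, so ordering is chronological
idx <- ave(as.integer(as.Date(df$time_tested)), df$id, FUN = rank_dates)
df$period <- labels[idx]

head(df, 4)
```

The same ranking could be fed into dplyr::case_when if you prefer to spell out the labels there, but indexing into a label vector keeps the mapping in one place.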