case <- c("A","A","A","B","B","C","C","C","C")
date <- c("2022-01-01","2022-01-08","2022-06-07","2022-05-08","2022-03-06","2022-09-08","2022-09-23","2022-12-08","2022-06-05")
df <- data.frame(case,date)
I have a dataframe that looks like this:
# A tibble: 9 x 2
case date
<chr> <chr>
1 A 2022-01-01
2 A 2022-01-08
3 A 2022-06-07
4 B 2022-05-08
5 B 2022-03-06
6 C 2022-09-08
7 C 2022-09-23
8 C 2022-12-08
9 C 2022-06-05
I would like to essentially pivot_wider the rows based on date where the earliest date would become instance_1, next instance_2 and so far. I have tried the pivot_wider function but can't get the syntax right. Any help is appreciated.
CodePudding user response:
We need a sequence column by 'case' and then do pivot_wider
library(tidyr)
library(dplyr)
library(data.table)
library(stringr)
df %>%
arrange(case, date) %>%
mutate(cn = str_c('instance_', rowid(case))) %>%
pivot_wider(names_from = cn, values_from = date)
-output
# A tibble: 3 × 5
case instance_1 instance_2 instance_3 instance_4
<chr> <chr> <chr> <chr> <chr>
1 A 2022-01-01 2022-01-08 2022-06-07 <NA>
2 B 2022-03-06 2022-05-08 <NA> <NA>
3 C 2022-06-05 2022-09-08 2022-09-23 2022-12-08
Or a similar option with dcast
library(data.table)
dcast(setDT(df)[order(case, date)],
case ~ paste0('instance_', rowid(case)), value.var = 'date')
-output
Key: <case>
case instance_1 instance_2 instance_3 instance_4
<char> <char> <char> <char> <char>
1: A 2022-01-01 2022-01-08 2022-06-07 <NA>
2: B 2022-03-06 2022-05-08 <NA> <NA>
3: C 2022-06-05 2022-09-08 2022-09-23 2022-12-08