Home > Software engineering >  Create columns based on date
Create columns based on date

Time:10-13

case <- c("A","A","A","B","B","C","C","C","C")
date <- c("2022-01-01","2022-01-08","2022-06-07","2022-05-08","2022-03-06","2022-09-08","2022-09-23","2022-12-08","2022-06-05")
df <- data.frame(case,date)

I have a dataframe that looks like this:

# A tibble: 9 x 2
  case  date      
  <chr> <chr>     
1 A     2022-01-01
2 A     2022-01-08
3 A     2022-06-07
4 B     2022-05-08
5 B     2022-03-06
6 C     2022-09-08
7 C     2022-09-23
8 C     2022-12-08
9 C     2022-06-05

I would like to essentially pivot_wider the rows based on date where the earliest date would become instance_1, next instance_2 and so far. I have tried the pivot_wider function but can't get the syntax right. Any help is appreciated.

CodePudding user response:

We need a sequence column by 'case' and then do pivot_wider

library(tidyr)
library(dplyr)
library(data.table)
library(stringr)
df %>%
  arrange(case, date) %>%
   mutate(cn = str_c('instance_', rowid(case))) %>%
   pivot_wider(names_from = cn, values_from = date)

-output

# A tibble: 3 × 5
  case  instance_1 instance_2 instance_3 instance_4
  <chr> <chr>      <chr>      <chr>      <chr>     
1 A     2022-01-01 2022-01-08 2022-06-07 <NA>      
2 B     2022-03-06 2022-05-08 <NA>       <NA>      
3 C     2022-06-05 2022-09-08 2022-09-23 2022-12-08

Or a similar option with dcast

library(data.table)
dcast(setDT(df)[order(case, date)],
   case ~ paste0('instance_', rowid(case)), value.var = 'date')

-output

Key: <case>
     case instance_1 instance_2 instance_3 instance_4
   <char>     <char>     <char>     <char>     <char>
1:      A 2022-01-01 2022-01-08 2022-06-07       <NA>
2:      B 2022-03-06 2022-05-08       <NA>       <NA>
3:      C 2022-06-05 2022-09-08 2022-09-23 2022-12-08
  • Related