Fill rows of a data frame by group in R

Time:04-22

I would like to transform this data frame:

id date X
001 2022-04-12 dsf
001 2022-04-13 df
001 2022-04-14 sdw
002 2022-04-12 we3
002 2022-04-13 r45
002 2022-04-14 2s
003 2022-04-12 de65
003 2022-04-13 hjk
004 2022-04-12 vcbdf
005 2022-04-12 342f

to this one:

id date X
001 2022-04-12 dsf
001 2022-04-13 df
001 2022-04-14 sdw
002 2022-04-12 we3
002 2022-04-13 r45
002 2022-04-14 2s
003 2022-04-12 de65
003 2022-04-13 hjk
003 2022-04-14 NA
004 2022-04-12 vcbdf
004 2022-04-13 NA
004 2022-04-14 NA
005 2022-04-12 342f
005 2022-04-13 NA
005 2022-04-14 NA

As you can see, the target date range is 2022-04-12 to 2022-04-14, so any id that is missing one of those dates should get a new row for it, with NA in X.

Any ideas?

CodePudding user response:

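You can use tidyr::complete() to expand the data to every combination of id and date, where the dates form a daily sequence from the overall minimum date to the overall maximum. Combinations that are not in the original data are added with NA in X.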

library(dplyr)
library(tidyr)

df %>%
  mutate(date = as.Date(date)) %>%
  complete(id, date = seq(min(date), max(date), 'days'))

# A tibble: 15 × 3
#       id date       X    
#   <int> <date>     <chr>
# 1     1 2022-04-12 dsf  
# 2     1 2022-04-13 df   
# 3     1 2022-04-14 sdw  
# 4     2 2022-04-12 we3  
# 5     2 2022-04-13 r45  
# 6     2 2022-04-14 2s   
# 7     3 2022-04-12 de65 
# 8     3 2022-04-13 hjk  
# 9     3 2022-04-14 NA   
#10     4 2022-04-12 vcbdf
#11     4 2022-04-13 NA   
#12     4 2022-04-14 NA   
#13     5 2022-04-12 342f 
#14     5 2022-04-13 NA   
#15     5 2022-04-14 NA   
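
For a version that can be run from scratch, here is a minimal sketch that rebuilds the question's data and states the target range explicitly instead of deriving it from the data. The names target_dates and filled are only illustrative, and id is assumed to be stored as character so the leading zeros are kept (the output above shows id as <int>, which is why it prints as 1, 2, ... there).

```r
library(dplyr)
library(tidyr)

# Rebuild the data from the question; id is kept as character (assumption)
# so that "001" does not turn into 1.
df <- data.frame(
  id   = c("001", "001", "001", "002", "002", "002", "003", "003", "004", "005"),
  date = c("2022-04-12", "2022-04-13", "2022-04-14",
           "2022-04-12", "2022-04-13", "2022-04-14",
           "2022-04-12", "2022-04-13", "2022-04-12", "2022-04-12"),
  X    = c("dsf", "df", "sdw", "we3", "r45", "2s", "de65", "hjk", "vcbdf", "342f"),
  stringsAsFactors = FALSE
)

# Illustrative: the target range written out by hand rather than taken
# from min(date) and max(date).
target_dates <- seq(as.Date("2022-04-12"), as.Date("2022-04-14"), by = "day")

filled <- df %>%
  mutate(date = as.Date(date)) %>%        # convert text to Date
  complete(id, date = target_dates)       # every id paired with every target date

print(filled)
```

Fixing the range by hand may be preferable when no id happens to cover the full range, since min(date) and max(date) only see the dates that are actually present. With this data, filled has 15 rows, and X is NA for the five id/date pairs that were missing.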