Home > Blockchain >  Expand date in time series using years
Expand date in time series using years

Time:04-07

I have the following time series

         location  date  value
          North   199001   a
          North   199203   b
          North   199402   c
          North   199506   d
          South   198005   e
          South   198304   f

I would like to extract the years and expand the rows per group to obtain something like

         location  date value 
          North   1990    a
          North   1991    a
          North   1992    b
          North   1993    b
          North   1994    c
          North   1995    d
          South   1980    e
          South   1981    e
          South   1982    e
          South   1983    f

Note that I would like to repeat a value for the rows expanded that were not in the original dataset. I have been trying using lubridate and dplyr but I'm not being able to do it. Can anybody help me with this?

CodePudding user response:

A dplyr / tidyr solution: substract the first four digits of your date column, convert to as.numeric, group_by location, complete the years per location and fill the values:

Code

library(dplyr)
library(tidyr)

df %>% mutate(date = as.numeric(substr(date, 1, 4))) %>% 
  group_by(location) %>% 
  complete(date = full_seq(date, 1)) %>% fill(value)

Output

 1 North     1990 a    
 2 North     1991 a    
 3 North     1992 b    
 4 North     1993 b    
 5 North     1994 c    
 6 North     1995 d    
 7 South     1980 e    
 8 South     1981 e    
 9 South     1982 e    
10 South     1983 f 

Data

df <- data.frame(fread("location  date  value
North   199001   a
North   199203   b
North   199402   c
North   199506   d
South   198005   e
South   198304   f"))
  • Related