Home > Mobile >  Create new columns by adding a sequence of months to a date column
Create new columns by adding a sequence of months to a date column

Time:01-28

I have data with a date column:

library(data.table)
Y = data.table(X = c("2012-12-31", "2021-10-31"))

            X  
1: 2012-12-31
2: 2021-10-31                    

I want to add a sequence of months, from 1 to x, to the dates in column "X". The result should be a set of new columns, "X_1", "X_2", ... "X_x":

            X          X_1          X_2    X_x
1: 2012-12-31   2013-01-31   2013-02-29 .. (2012-12-31) %m % months(x)
2: 2021-10-31   2021-11-30   2021-12-31 .. (2021-10-31) %m % months(x)

I've tried several things that didn't work, such as:

  Y[ , c(paste0("X_", 1:15))] = AddMonths(Y$X, c(1:15)) %>% LastDayInMonth()
  Y[ , c(paste0("X_", 1:15)) := AddMonths(as.IDate(X), c(1:15)) %>% LastDayInMonth()]

I want to create 15 columns based on this operation, so I want to avoid for loops. My real data have 80K rows.

CodePudding user response:

As the question is tagged with dplyr I assume tidyverse is allowed, too.

Starting with a list for the 2 dates you have.

A function taking a date as input adds from starting month as many months as choosen in the additional argument.

The function is then applied with map to the list of initial dates. This is then transformed into a data.frame.

  • Example shows only n=6
  • Contains loop
library(tidyverse)
library(lubridate)

f <- function(x, n = 5) {
  yy <- x
  for (i in seq(n)) {
    yy <- append(yy, x %m % months(i))
  }
  return(yy)
}

l <- list(("2012-12-31"), ("2021-10-31"))
l |>
  map(lubridate::as_date) |>
  map(n = 6, f) |>
  as.data.frame() |>
  setNames(l) |>
  t() |> as.data.frame()
#>                    V1         V2         V3         V4         V5         V6
#> 2012-12-31 2012-12-31 2013-01-31 2013-02-28 2013-03-31 2013-04-30 2013-05-31
#> 2021-10-31 2021-10-31 2021-11-30 2021-12-31 2022-01-31 2022-02-28 2022-03-31
#>                    V7
#> 2012-12-31 2013-06-30
#> 2021-10-31 2022-04-30

CodePudding user response:

Both %m % and months are vectorized.

library(data.table)
library(lubridate)

n = 3 # or whatever desired number of columns to create
nm = paste0("x", "_", seq(n))
m = rep(seq(n), each = nrow(d))
d[ , (nm) := split(x %m % months(m), m)]
d
#             x        x_1        x_2        x_3
# 1: 2012-12-31 2013-01-31 2013-02-28 2013-03-31
# 2: 2021-10-31 2021-11-30 2021-12-31 2022-01-31

d = data.table(x = as.Date(c("2012-12-31","2021-10-31")))

CodePudding user response:

In a for loop we can do it like this:

library(lubridate) #%m %
Y=data.table("X"=as.Date(c("2012-12-31","2021-10-31")))

for (i in 2:15) {
  Y <- Y %>% 
    mutate(col= X %m % months(1))
  colnames(Y)[i] <- paste0("X_",i)
}
            X        X_2        X_3        X_4        X_5        X_6        X_7        X_8        X_9       X_10       X_11       X_12       X_13
1: 2012-12-31 2013-01-31 2013-01-31 2013-01-31 2013-01-31 2013-01-31 2013-01-31 2013-01-31 2013-01-31 2013-01-31 2013-01-31 2013-01-31 2013-01-31
2: 2021-10-31 2021-11-30 2021-11-30 2021-11-30 2021-11-30 2021-11-30 2021-11-30 2021-11-30 2021-11-30 2021-11-30 2021-11-30 2021-11-30 2021-11-30
         X_14       X_15
1: 2013-01-31 2013-01-31
2: 2021-11-30 2021-11-30
  • Related