I have data with a date column:
library(data.table)
Y = data.table(X = c("2012-12-31", "2021-10-31"))
X
1: 2012-12-31
2: 2021-10-31
I want to add a sequence of months, from 1 to x
, to the dates in column "X". The result should be a set of new columns, "X_1", "X_2", ... "X_x":
X X_1 X_2 X_x
1: 2012-12-31 2013-01-31 2013-02-29 .. (2012-12-31) %m % months(x)
2: 2021-10-31 2021-11-30 2021-12-31 .. (2021-10-31) %m % months(x)
I've tried several things that didn't work, such as:
Y[ , c(paste0("X_", 1:15))] = AddMonths(Y$X, c(1:15)) %>% LastDayInMonth()
Y[ , c(paste0("X_", 1:15)) := AddMonths(as.IDate(X), c(1:15)) %>% LastDayInMonth()]
I want to create 15 columns based on this operation, so I want to avoid for
loops. My real data have 80K rows.
CodePudding user response:
As the question is tagged with dplyr
I assume tidyverse
is allowed, too.
Starting with a list for the 2 dates you have.
A function taking a date
as input adds from starting month as many months as choosen in the additional argument.
The function is then applied with map
to the list of initial dates. This is then transformed into a data.frame
.
- Example shows only
n=6
- Contains loop
library(tidyverse)
library(lubridate)
f <- function(x, n = 5) {
yy <- x
for (i in seq(n)) {
yy <- append(yy, x %m % months(i))
}
return(yy)
}
l <- list(("2012-12-31"), ("2021-10-31"))
l |>
map(lubridate::as_date) |>
map(n = 6, f) |>
as.data.frame() |>
setNames(l) |>
t() |> as.data.frame()
#> V1 V2 V3 V4 V5 V6
#> 2012-12-31 2012-12-31 2013-01-31 2013-02-28 2013-03-31 2013-04-30 2013-05-31
#> 2021-10-31 2021-10-31 2021-11-30 2021-12-31 2022-01-31 2022-02-28 2022-03-31
#> V7
#> 2012-12-31 2013-06-30
#> 2021-10-31 2022-04-30
CodePudding user response:
Both %m %
and months
are vectorized.
library(data.table)
library(lubridate)
n = 3 # or whatever desired number of columns to create
nm = paste0("x", "_", seq(n))
m = rep(seq(n), each = nrow(d))
d[ , (nm) := split(x %m % months(m), m)]
d
# x x_1 x_2 x_3
# 1: 2012-12-31 2013-01-31 2013-02-28 2013-03-31
# 2: 2021-10-31 2021-11-30 2021-12-31 2022-01-31
d = data.table(x = as.Date(c("2012-12-31","2021-10-31")))
CodePudding user response:
In a for loop we can do it like this:
library(lubridate) #%m %
Y=data.table("X"=as.Date(c("2012-12-31","2021-10-31")))
for (i in 2:15) {
Y <- Y %>%
mutate(col= X %m % months(1))
colnames(Y)[i] <- paste0("X_",i)
}
X X_2 X_3 X_4 X_5 X_6 X_7 X_8 X_9 X_10 X_11 X_12 X_13
1: 2012-12-31 2013-01-31 2013-01-31 2013-01-31 2013-01-31 2013-01-31 2013-01-31 2013-01-31 2013-01-31 2013-01-31 2013-01-31 2013-01-31 2013-01-31
2: 2021-10-31 2021-11-30 2021-11-30 2021-11-30 2021-11-30 2021-11-30 2021-11-30 2021-11-30 2021-11-30 2021-11-30 2021-11-30 2021-11-30 2021-11-30
X_14 X_15
1: 2013-01-31 2013-01-31
2: 2021-11-30 2021-11-30