I have a list of dates and want to create a column in an existing data frame for every element of the list and in that column, I want a 1 if the date is between the start and end date.
input:
my_date = as.Date(c("2022-10-02","2022-10-31","2022-10-05"))
df1 <- data.frame(ID = c(1, 2, 3),
start_date = as.Date(c("2022-10-01","2022-10-08","2022-10-01")),
end_date = as.Date(c("2022-10-22","2022-10-22","2022-10-08")))
output:
df1 <- data.frame(ID = c(1, 2, 3),
start_date = as.Date(c("2022-10-01","2022-10-08","2022-10-01")),
end_date = as.Date(c("2022-10-22","2022-10-22","2022-10-08")),
flag_1= c(1,0,1), flag_2=c(0,0,0), flag_3= c(1,0,1))
I could do it manually, but I have to many rows. I was looking into a solution with mutate but not sure how to implement.
CodePudding user response:
Do you want something like this? I assume you want to compare each element in my_date
to the respective row in df1
and not every row.
library(tidyverse)
df1 |>
mutate(flag = pmap_dbl(list(start_date, end_date, my_date),
~` `(between(..3, ..1, ..2))) )
#> ID start_date end_date flag
#> 1 1 2022-10-01 2022-10-22 1
#> 2 2 2022-10-08 2022-10-22 0
#> 3 3 2022-10-01 2022-10-08 1
EDIT from clarification
df1 |>
mutate(flag = map2(start_date, end_date,
~pmap_dbl(list(.x, .y, my_date),
~` `(between(..3, ..1, ..2))))) |>
unnest_wider(flag, names_sep = "_")
#> # A tibble: 3 x 6
#> ID start_date end_date flag_1 flag_2 flag_3
#> <dbl> <date> <date> <dbl> <dbl> <dbl>
#> 1 1 2022-10-01 2022-10-22 1 0 1
#> 2 2 2022-10-08 2022-10-22 0 0 0
#> 3 3 2022-10-01 2022-10-08 1 0 1
CodePudding user response:
Using base:
cbind(df1,
lapply(setNames(my_date, paste0("flag_", seq_along(my_date))),
function(i) as.integer(df1$start_date <= i & df1$end_date >= i)))
# ID start_date end_date flag_1 flag_2 flag_3
# 1 1 2022-10-01 2022-10-22 1 0 1
# 2 2 2022-10-08 2022-10-22 0 0 0
# 3 3 2022-10-01 2022-10-08 1 0 1