I have a data frame and a vector of dates. I want to compare the vector with a column of the data frame and assign each row to a group based on the interval its value falls in. The problem is that the vector is dynamic, so I need code that works for whatever length it has.
This is a minimal reproducible example of my data frame:
value <- c(rnorm(39, 5, 2))
Date <- seq(as.POSIXct('2021-01-18'), as.POSIXct('2021-10-15'), by = "7 days")
df <- data.frame(Date, value)
This is the vector I have to compare with the Date column of the data frame:
dates_tour <- as.POSIXct(c('2021-01-18', '2021-05-18', '2021-08-18', '2021-10-15'))
This creates the desired output:
library(dplyr)
df <- df %>% mutate(tour = case_when(
  Date >= dates_tour[1] & Date <= dates_tour[2] ~ 1,
  Date > dates_tour[2] & Date <= dates_tour[3] ~ 2,
  Date > dates_tour[3] & Date <= dates_tour[4] ~ 3))
However, I don't want to hard-code the intervals like that, because this project needs to be updated frequently and the length of dates_tour changes.
I would like the code that creates the tour variable to take that into account.
I tried to do it like this, but it doesn't work:
for (i in 1:length(dates_tour)) {
  df <- df %>% mutate(tour = case_when(Date >= dates_tour[i] & Date <= dates_tour[i + 1] ~ i))
}
CodePudding user response:
You can use cut to bin a vector based on break points:
library(dplyr)
df %>%
  mutate(
    tour = cut(Date, breaks = dates_tour, labels = seq_along(dates_tour[-1]))
  )
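One detail worth checking: for date-times, cut() defaults to right = FALSE, so each bin includes its start and excludes its end, and a value exactly equal to the last break becomes NA. The case_when in the question does the opposite (each interval includes its end). A minimal base R sketch that reproduces the question's boundaries might look like this; the tz = "UTC" setting and the final conversion to integer are assumptions added for illustration, not part of the original post.

```r
# Sketch in base R; the data mirrors the question.
# Assumption: a fixed time zone, so the result does not depend on the machine.
tz <- "UTC"
Date <- seq(as.POSIXct("2021-01-18", tz = tz),
            as.POSIXct("2021-10-15", tz = tz), by = "7 days")
df <- data.frame(Date, value = rnorm(length(Date), 5, 2))
dates_tour <- as.POSIXct(
  c("2021-01-18", "2021-05-18", "2021-08-18", "2021-10-15"), tz = tz
)

# right = TRUE          -> intervals are (start, end]
# include.lowest = TRUE -> the first interval also includes its start
tour <- cut(df$Date, breaks = dates_tour,
            labels = seq_len(length(dates_tour) - 1),
            right = TRUE, include.lowest = TRUE)

# cut() returns a factor; convert the labels back to integers
df$tour <- as.integer(as.character(tour))

table(df$tour)
```

Because the labels are derived from length(dates_tour), nothing needs to change when break points are added or removed.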
CodePudding user response:
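Adjacent elements of dates_tour define the intervals: dropping the last element gives the interval starts, and dropping the first gives the interval ends. We can store these in a tibble and then loop over its rows:
library(dplyr)
library(purrr)
# one row per interval: start = lower bound, end = upper bound
keydat <- tibble(start = dates_tour[-length(dates_tour)],
                 end = dates_tour[-1])
# build one candidate vector per interval, then keep the first non-NA value per row
df$tour <- imap(seq_len(nrow(keydat)),
                ~ case_when(df$Date >= keydat$start[.x] &
                              df$Date <= keydat$end[.x] ~ .y)) %>%
  invoke(coalesce, .)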
-output
> df
Date value tour
1 2021-01-18 00:00:00 7.874620 1
2 2021-01-25 00:00:00 9.704973 1
3 2021-02-01 00:00:00 5.898070 1
4 2021-02-08 00:00:00 3.287319 1
5 2021-02-15 00:00:00 5.488132 1
6 2021-02-22 00:00:00 4.425636 1
7 2021-03-01 00:00:00 6.244084 1
8 2021-03-08 00:00:00 5.528364 1
9 2021-03-15 01:00:00 7.954929 1
10 2021-03-22 01:00:00 4.691995 1
11 2021-03-29 01:00:00 5.943415 1
12 2021-04-05 01:00:00 5.316373 1
13 2021-04-12 01:00:00 5.182952 1
14 2021-04-19 01:00:00 3.330700 1
15 2021-04-26 01:00:00 7.461089 1
16 2021-05-03 01:00:00 4.338873 1
17 2021-05-10 01:00:00 5.768665 1
18 2021-05-17 01:00:00 3.574488 1
19 2021-05-24 01:00:00 5.106042 2
20 2021-05-31 01:00:00 2.828844 2
21 2021-06-07 01:00:00 4.616084 2
22 2021-06-14 01:00:00 7.234506 2
23 2021-06-21 01:00:00 4.760413 2
24 2021-06-28 01:00:00 7.020543 2
25 2021-07-05 01:00:00 7.403235 2
26 2021-07-12 01:00:00 6.368435 2
27 2021-07-19 01:00:00 3.527764 2
28 2021-07-26 01:00:00 5.254025 2
29 2021-08-02 01:00:00 5.676425 2
30 2021-08-09 01:00:00 3.783304 2
31 2021-08-16 01:00:00 6.310292 2
32 2021-08-23 01:00:00 2.938218 3
33 2021-08-30 01:00:00 5.101852 3
34 2021-09-06 01:00:00 3.765659 3
35 2021-09-13 01:00:00 5.489846 3
36 2021-09-20 01:00:00 4.174276 3
37 2021-09-27 01:00:00 7.348895 3
38 2021-10-04 01:00:00 5.103772 3
39 2021-10-11 01:00:00 4.941248 3
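The same start/end pairing can also be written as a plain loop, which is close to what the question attempted. The loop in the question fails because every iteration overwrites the whole tour column, so only the last iteration survives, and on that last iteration dates_tour[i + 1] is past the end of the vector and therefore NA. The sketch below is one possible fix, assuming base R only and a fixed "UTC" time zone (both are choices made for this example): it loops over the intervals rather than the break points and fills in only the rows that match.

```r
# Sketch in base R; the data mirrors the question.
# Assumption: a fixed time zone, so the result does not depend on the machine.
tz <- "UTC"
Date <- seq(as.POSIXct("2021-01-18", tz = tz),
            as.POSIXct("2021-10-15", tz = tz), by = "7 days")
df <- data.frame(Date, value = rnorm(length(Date), 5, 2))
dates_tour <- as.POSIXct(
  c("2021-01-18", "2021-05-18", "2021-08-18", "2021-10-15"), tz = tz
)

# One start and one end per interval (one fewer than the number of breaks)
starts <- dates_tour[-length(dates_tour)]
ends   <- dates_tour[-1]

df$tour <- NA_integer_
for (i in seq_along(starts)) {
  # The first interval includes its start; later ones exclude it,
  # matching the case_when in the question
  after_start <- if (i == 1) df$Date >= starts[i] else df$Date > starts[i]
  # Assign only the matching rows so earlier results are not overwritten
  df$tour[after_start & df$Date <= ends[i]] <- i
}

table(df$tour)
```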