**** updated as I missed a valuable bit of information that the date records are not unique in the dataset.****
I'm sure this is straightforward but I can't figure out how to do this without a manual lookup table of individual dates and references.
I have a table of dates (y/m/d) and several date ranges.
My dates look like this:
Date |
---|
2022/4/11 |
2022/4/12 |
2022/4/13 |
2022/4/13 |
2022/4/13 |
2022/4/13 |
2022/4/13 |
2022/4/14 |
2022/4/14 |
2022/4/14 |
2022/4/15 |
2022/4/16 |
2022/4/16 |
2022/4/16 |
My date range looks like this:
Start.date | Completed.date |
---|---|
2022/4/13 | 2022/4/16 |
How do I add a column to show the date series? My output should look like:
Date | Date.number |
---|---|
2022/4/11 | |
2022/4/12 | |
2022/4/13 | 1 |
2022/4/13 | 1 |
2022/4/13 | 1 |
2022/4/13 | 1 |
2022/4/13 | 1 |
2022/4/14 | 2 |
2022/4/14 | 2 |
2022/4/14 | 2 |
2022/4/15 | 3 |
2022/4/16 | 4 |
2022/4/16 | 4 |
2022/4/16 | 4 |
2022/4/17 | |
2022/4/17 | |
2022/4/18 |
Advanced thanks for any help.
Code for tables:
Date.range <- structure(list(Start.date = "2022/4/13", Completed.date = "2022/4/16"), class = "data.frame", row.names = c(NA,-1L))
Date <- structure(list(Date = c("2022/4/11", "2022/4/12", "2022/4/13",
"2022/4/14", "2022/4/14","2022/4/14","2022/4/15", "2022/4/16","2022/4/16","2022/4/16",
"2022/4/17","2022/4/17","2022/4/18")), class = "data.frame", row.names = c(NA,-6L))
CodePudding user response:
A possible solution:
library(dplyr)
library(lubridate)
Date.range <- structure(list(Start.date = "2022/4/13", Completed.date = "2022/4/16"), class = "data.frame", row.names = c(NA,-1L))
Date <- data.frame(
Date = c("2022/4/11","2022/4/12","2022/4/13","2022/4/13",
"2022/4/13","2022/4/13","2022/4/13","2022/4/14",
"2022/4/14","2022/4/14","2022/4/15","2022/4/16",
"2022/4/16","2022/4/16"))
Date %>%
mutate(Day.number = ymd(Date) >= ymd(Date.range$Start.date) & ymd(Date) <= ymd(Date.range$Completed.date)) %>%
group_by(Date) %>%
mutate(Day.number = if_else(Day.number, cur_group_id(), NA_integer_)) %>%
ungroup %>%
mutate(Day.number = Day.number - min(Day.number, na.rm = T) 1)
#> # A tibble: 14 × 2
#> Date Day.number
#> <chr> <dbl>
#> 1 2022/4/11 NA
#> 2 2022/4/12 NA
#> 3 2022/4/13 1
#> 4 2022/4/13 1
#> 5 2022/4/13 1
#> 6 2022/4/13 1
#> 7 2022/4/13 1
#> 8 2022/4/14 2
#> 9 2022/4/14 2
#> 10 2022/4/14 2
#> 11 2022/4/15 3
#> 12 2022/4/16 4
#> 13 2022/4/16 4
#> 14 2022/4/16 4