How to check in a column whether dates are one off from each other
CodePudding user response:
With tidyverse
you can group_by
both the id
as well as a second id id2
that would group the rows together that are separated by a difference of one day. Then, the number of consecutive days column would include the difference between the last
date and the first
date
(or zero if not the first
date
).
library(tidyverse)
data %>%
mutate(date = as.Date(date, format = "%m-%d-%Y")) %>%
arrange(id, date) %>%
group_by(id) %>%
group_by(id2 = cumsum(c(T, diff(date) > 1)), .add = T) %>%
mutate(num_con_days = ifelse(date == first(date), last(date) - date 1, 0)) %>%
ungroup %>%
select(-id2)
Output
day id date num_con_days
<dbl> <dbl> <date> <dbl>
1 1 10 2021-01-01 1
2 3 10 2021-01-03 4
3 4 10 2021-01-04 0
4 5 10 2021-01-05 0
5 6 10 2021-01-06 0
6 1 24 2021-01-01 2
7 2 24 2021-01-02 0
8 4 24 2021-01-04 3
9 5 24 2021-01-05 0
10 6 24 2021-01-06 0
Edit: Using your other example, with renamed column names, you have the following data.frame:
id day num_consecutive_days
1 1 2021-01-02 1
2 2 2021-01-02 1
3 2 2021-01-05 2
4 2 2021-01-06 0
5 2 2021-01-12 1
6 3 2021-01-01 2
7 3 2021-01-02 0
8 3 2021-01-04 1
9 3 2021-01-11 1
10 4 2021-01-01 1
Here, your day
is in Year-Month-Day format (so in converting to a Date, you don't need to provide a separate format).
Also, you will need to make sure your column names match and are consistent for day
. See below similar code - this should be the same as your desired output.
df %>%
mutate(day = as.Date(day)) %>%
arrange(id, day) %>%
group_by(id) %>%
group_by(id2 = cumsum(c(T, diff(day) > 1)), .add = T) %>%
mutate(num_con_days = ifelse(day == first(day), last(day) - day 1, 0)) %>%
ungroup %>%
select(-id2)
Output
id day num_consecutive_days
1 1 2021-01-02 1
2 2 2021-01-02 1
3 2 2021-01-05 2
4 2 2021-01-06 0
5 2 2021-01-12 1
6 3 2021-01-01 2
7 3 2021-01-02 0
8 3 2021-01-04 1
9 3 2021-01-11 1
10 4 2021-01-01 1