I have a dataframe and I would like to create a counter that increment of 1 for each record (for a same ID/ID2 group) except if the next record is at the same date or the day after. it would be like this: (just to create thecounter)
ID | ID2 | date | counter |
---|---|---|---|
1 | a | 01/01/2021 | 1 |
1 | a | 02/01/2021 | 1 |
1 | a | 05/01/2021 | 2 |
1 | b | 06/01/2021 | 1 |
1 | b | 10/01/2021 | 2 |
1 | b | 11/01/2021 | 2 |
2 | a | 01/01/2021 | 1 |
2 | a | 05/01/2021 | 2 |
2 | a | 06/01/2021 | 2 |
2 | b | 10/01/2021 | 1 |
2 | c | 11/01/2021 | 1 |
2 | c | 11/01/2021 | 1 |
2 | c | 16/01/2021 | 2 |
2 | c | 24/01/2021 | 3 |
I was trying to use dplyr with row_number() but I couldn't figure out how to add the condition not to increment the counter when gap of days are 0 or 1. Any help would be useful, thanks a lot!
CodePudding user response:
Maybe you can try this with tidyverse
. After grouping by both ID
and ID2
, use cumsum
where the difference in date
is greater than 1.
library(tidyverse)
df %>%
mutate(date = as.Date(date, format = "%d/%m/%Y")) %>%
arrange(date) %>%
group_by(ID, ID2) %>%
mutate(new_counter = c(1, cumsum(diff(date) > 1) 1)) %>%
arrange(ID, ID2)
Output
ID ID2 date counter new_counter
<int> <chr> <date> <int> <dbl>
1 1 a 2021-01-01 1 1
2 1 a 2021-01-02 1 1
3 1 a 2021-01-05 2 2
4 1 b 2021-01-06 1 1
5 1 b 2021-01-10 2 2
6 1 b 2021-01-11 2 2
7 2 a 2021-01-01 1 1
8 2 a 2021-01-05 2 2
9 2 a 2021-01-06 2 2
10 2 b 2021-01-10 1 1
11 2 c 2021-01-11 1 1
12 2 c 2021-01-11 1 1
13 2 c 2021-01-16 2 2
14 2 c 2021-01-24 3 3
CodePudding user response:
df |>
mutate(date = as.Date(date, format = "%d/%m/%y")) |>
group_by(ID, ID2) |>
mutate(counter = purrr::accumulate(date, ~{
if(.y - .x < 2)
.x
else
.y
}) |> dense_rank())
# A tibble: 14 × 4
# Groups: ID, ID2 [5]
ID ID2 date counter
<dbl> <chr> <date> <int>
1 1 "a " 2020-01-01 1
2 1 "a " 2020-01-02 1
3 1 "a " 2020-01-05 2
4 1 "b " 2020-01-06 1
5 1 "b " 2020-01-10 2
6 1 "b " 2020-01-11 2
7 2 "a " 2020-01-01 1
8 2 "a " 2020-01-05 2
9 2 "a " 2020-01-06 2
10 2 "b " 2020-01-10 1
11 2 "c " 2020-01-11 1
12 2 "c " 2020-01-11 1
13 2 "c " 2020-01-16 2
14 2 "c " 2020-01-24 3