Home > Net >  Conditional Incremental counter in R
Conditional Incremental counter in R

Time:03-15

I have a dataframe and I would like to create a counter that increment of 1 for each record (for a same ID/ID2 group) except if the next record is at the same date or the day after. it would be like this: (just to create thecounter)

ID ID2 date counter
1 a 01/01/2021 1
1 a 02/01/2021 1
1 a 05/01/2021 2
1 b 06/01/2021 1
1 b 10/01/2021 2
1 b 11/01/2021 2
2 a 01/01/2021 1
2 a 05/01/2021 2
2 a 06/01/2021 2
2 b 10/01/2021 1
2 c 11/01/2021 1
2 c 11/01/2021 1
2 c 16/01/2021 2
2 c 24/01/2021 3

I was trying to use dplyr with row_number() but I couldn't figure out how to add the condition not to increment the counter when gap of days are 0 or 1. Any help would be useful, thanks a lot!

CodePudding user response:

Maybe you can try this with tidyverse. After grouping by both ID and ID2, use cumsum where the difference in date is greater than 1.

library(tidyverse)

df %>%
  mutate(date = as.Date(date, format = "%d/%m/%Y")) %>%
  arrange(date) %>%
  group_by(ID, ID2) %>%
  mutate(new_counter = c(1, cumsum(diff(date) > 1)   1)) %>%
  arrange(ID, ID2)

Output

      ID ID2   date       counter new_counter
   <int> <chr> <date>       <int>       <dbl>
 1     1 a     2021-01-01       1           1
 2     1 a     2021-01-02       1           1
 3     1 a     2021-01-05       2           2
 4     1 b     2021-01-06       1           1
 5     1 b     2021-01-10       2           2
 6     1 b     2021-01-11       2           2
 7     2 a     2021-01-01       1           1
 8     2 a     2021-01-05       2           2
 9     2 a     2021-01-06       2           2
10     2 b     2021-01-10       1           1
11     2 c     2021-01-11       1           1
12     2 c     2021-01-11       1           1
13     2 c     2021-01-16       2           2
14     2 c     2021-01-24       3           3

CodePudding user response:

df |>
  mutate(date = as.Date(date, format = "%d/%m/%y")) |>
  group_by(ID, ID2) |>
  mutate(counter = purrr::accumulate(date, ~{
    if(.y - .x < 2)
      .x
    else
      .y
  }) |> dense_rank())


# A tibble: 14 × 4
# Groups:   ID, ID2 [5]
      ID ID2   date       counter
   <dbl> <chr> <date>       <int>
 1     1 "a "  2020-01-01       1
 2     1 "a "  2020-01-02       1
 3     1 "a "  2020-01-05       2
 4     1 "b "  2020-01-06       1
 5     1 "b "  2020-01-10       2
 6     1 "b "  2020-01-11       2
 7     2 "a "  2020-01-01       1
 8     2 "a "  2020-01-05       2
 9     2 "a "  2020-01-06       2
10     2 "b "  2020-01-10       1
11     2 "c "  2020-01-11       1
12     2 "c "  2020-01-11       1
13     2 "c "  2020-01-16       2
14     2 "c "  2020-01-24       3


  •  Tags:  
  • r
  • Related