Add iterating group number to dataframe

Time:12-08

I would like to add a "group number" to each group in my dataframe. For example, if I have a dataframe like this:

dat <- tibble(
  user_id = c(1,2,1,1,2,2,1)
)

I can add a column noting whether or not the user changed by adding

dat <- dat %>% mutate(same_user = user_id==lag(user_id))

I would like to add a "group number" where sequence is important, i.e. I can't just group by user_id, because a new group should start every time the user changes. In this case, the sequential group number column would be c(1,2,3,3,4,4,5).

I tried using:

dat <- dat %>% mutate(turn_idx = if_else(row_number()==1,1,
                                  if_else(same_user==TRUE,turn_idx,turn_idx + 1)))

but I can't refer to the column I am creating inside its own definition, and I get this error:

Error: Problem with `mutate()` input `turn_idx`.
x object 'turn_idx' not found
ℹ Input `turn_idx` is `if_else(...)`.

Any idea how to create this?

Answer:

You can simply use data.table::rleid(), which gives each run of consecutive identical values its own id.

library(data.table)
library(dplyr)  # for %>% and mutate()

rleid(dat$user_id)
[1] 1 2 3 3 4 4 5

dat %>%
  mutate(turn_idx = rleid(user_id))

  user_id turn_idx
    <dbl>    <int>
1       1        1
2       2        2
3       1        3
4       1        3
5       2        4
6       2        4
7       1        5
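
If you would rather not load data.table, the same idea can be sketched with dplyr alone: flag the rows where user_id differs from the previous row, then take a running sum of those flags. This is only a sketch and assumes user_id contains no NA values; the helper column name `changed` is made up for illustration.

```r
library(dplyr)

dat <- tibble(user_id = c(1, 2, 1, 1, 2, 2, 1))

dat <- dat %>%
  mutate(
    # TRUE whenever the user differs from the previous row.
    # The first row is compared with itself, so it is FALSE.
    changed = user_id != lag(user_id, default = first(user_id)),
    # Running count of changes, shifted so numbering starts at 1.
    turn_idx = cumsum(changed) + 1L
  )

dat$turn_idx
```

For the example data this should give the same 1 2 3 3 4 4 5 sequence as rleid(). It also avoids the original error, because cumsum() builds the running counter from `changed` instead of reading turn_idx while it is still being created.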