I would like to add a "group number" to each group in my dataframe. For example, if I have a dataframe like this:
dat <- tibble(
user_id = c(1,2,1,1,2,2,1)
)
I can add a column noting whether or not the user changed by adding
dat <- dat %>% mutate(same_user = user_id==lag(user_id))
I would like to add a "group number" where sequence is important, i.e. I can't just group by user_id. In this case, the sequential group number column would be c(1,2,3,3,4,4,5).
I tried using:
dat <- dat %>% mutate(turn_idx = if_else(row_number()==1,1,
if_else(same_user==TRUE,turn_idx,turn_idx + 1)))
but I can't refer to the column I'm creating inside its own definition, and I get this error:
Error: Problem with `mutate()` input `turn_idx`.
x object 'turn_idx' not found
ℹ Input `turn_idx` is `if_else(...)`.
Any idea how to create this?
CodePudding user response:
Simply using data.table::rleid will help.
library(data.table)
rleid(dat$user_id)
[1] 1 2 3 3 4 4 5
dat %>%
mutate(turn_idx = rleid(user_id))
user_id turn_idx
<dbl> <int>
1 1 1
2 2 2
3 1 3
4 1 3
5 2 4
6 2 4
7 1 5
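If you would rather not load data.table, the same run index can probably be built with dplyr alone. This is a minimal sketch, not part of the answer above: it flags each row where user_id differs from the previous row and takes a running total of those flags. The helper column name `changed` is made up for illustration, and the sketch assumes user_id has no NA values.

```r
library(dplyr)

dat <- tibble(user_id = c(1, 2, 1, 1, 2, 2, 1))

dat <- dat %>%
  mutate(
    # TRUE wherever user_id differs from the previous row. The first row is
    # compared with itself (via default =), so it is FALSE rather than NA.
    changed = user_id != lag(user_id, default = first(user_id)),
    # Running count of changes, shifted so the first run is numbered 1.
    turn_idx = cumsum(changed) + 1L
  )

dat$turn_idx
```

Because `cumsum()` only increases when the value changes, repeated visits by the same user get a new number each time, which is the sequence-sensitive behaviour the question asks for.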