Count variable until observation changes

Time:12-18

Unfortunately, I can't wrap my head around this, but I'm sure there is a straightforward solution. I have a data.frame that looks like this:

set.seed(1)
mydf <- data.frame(group = sample(c("a", "b"), 20, replace = TRUE))

I'd like to create a new variable that counts, from top to bottom, how many times the same group has occurred in a row. For the example above, it should look like this:

mydf$question <- c(1, 2, 1, 2, 1, 1, 2, 3, 4, 1, 2, 3, 1, 1, 1, 1, 1, 2, 1, 1)
> mydf[1:10,]
   group question
1      a        1
2      a        2
3      b        1
4      b        2
5      a        1
6      b        1
7      b        2
8      b        3
9      b        4
10     a        1

Thanks for your help.

CodePudding user response:

Using data.table::rleid, which assigns a new id each time the value of group changes, together with dplyr, you could do:

set.seed(1)
mydf <- data.frame(group = sample(c("a", "b"), 20, replace = TRUE))

library(dplyr)
library(data.table)

mydf %>% 
  mutate(id = data.table::rleid(group)) %>% 
  group_by(id) %>% 
  mutate(question = row_number()) %>% 
  ungroup()
#> # A tibble: 20 × 3
#>    group    id question
#>    <chr> <int>    <int>
#>  1 a         1        1
#>  2 b         2        1
#>  3 a         3        1
#>  4 a         3        2
#>  5 b         4        1
#>  6 a         5        1
#>  7 a         5        2
#>  8 a         5        3
#>  9 b         6        1
#> 10 b         6        2
#> 11 a         7        1
#> 12 a         7        2
#> 13 a         7        3
#> 14 a         7        4
#> 15 a         7        5
#> 16 b         8        1
#> 17 b         8        2
#> 18 b         8        3
#> 19 b         8        4
#> 20 a         9        1
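
If you don't need the intermediate id column, the same idea can probably be written in a single step with data.table alone. This is only a sketch: the ten-row input is hard-coded to mirror the first rows shown in the question (so it does not depend on the random seed), and dt is a name made up for the illustration. rowid() numbers the rows within each distinct value of its argument, so applying it to the run ids gives the within-run counter.

```r
library(data.table)

# Hypothetical input: the first ten rows shown in the question, hard-coded
dt <- data.table(group = c("a", "a", "b", "b", "a", "b", "b", "b", "b", "a"))

# rleid() labels each run of identical values; rowid() counts within each label
dt[, question := rowid(rleid(group))]

print(dt)
```

The dplyr version above has the advantage that the id column stays available if you need it for later grouping.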

CodePudding user response:

Update: This is mostly the same as stefan's answer, but without the data.table package:

library(dplyr)
mydf %>% 
  mutate(myrleid = with(rle(group), rep(seq_along(lengths), lengths))) %>% 
  group_by(myrleid) %>% 
  mutate(question = row_number()) %>% 
  ungroup()
#>    group myrleid question
#>    <chr>   <int>    <int>
#>  1 a           1        1
#>  2 b           2        1
#>  3 a           3        1
#>  4 a           3        2
#>  5 b           4        1
#>  6 a           5        1
#>  7 a           5        2
#>  8 a           5        3
#>  9 b           6        1
#> 10 b           6        2
#> 11 a           7        1
#> 12 a           7        2
#> 13 a           7        3
#> 14 a           7        4
#> 15 a           7        5
#> 16 b           8        1
#> 17 b           8        2
#> 18 b           8        3
#> 19 b           8        4
#> 20 a           9        1
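
For completeness, the rle() idea can also be sketched in base R without any packages. The input below is hard-coded to the first ten rows shown in the question, so it is an assumption made for illustration rather than the seeded sample. rle() returns the length of each run, and sequence() expands each length n into 1, 2, ..., n, which is the counter being asked for.

```r
# Hypothetical input: the first ten rows shown in the question, hard-coded
mydf <- data.frame(group = c("a", "a", "b", "b", "a", "b", "b", "b", "b", "a"))

# as.character() guards against group being a factor, which rle() rejects
runs <- rle(as.character(mydf$group))

# Expand each run length n into the sequence 1..n
mydf$question <- sequence(runs$lengths)

print(mydf)
```

This gives only the counter; if the run id itself is needed, the rep(seq_along(lengths), lengths) expression from the answer above still applies.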