How to create an increasing index based on a certain condition?-CodePudding

Suppose I have this dataframe:

df <- data.frame(co11 = c(rep(1, 5), 5, 6, rep(1, 3), 2, 3, 4, 5, 8, rep(1, 2), rep(2, 2), 8, 10))

I would like to create another column (col2) with increasing group index whenever a value in a row is at least 5. To illustrate, here is the resulting df that I would like to get:

   co11  col2
1     1     1
2     1     1
3     1     1
4     1     1
5     1     1
6     5     2
7     6     3
8     1     3
9     1     3
10    1     3
11    2     3
12    3     3
13    4     3
14    5     4
15    8     5
16    1     5
17    1     5
18    2     5
19    2     5
20    8     6
21   10     7

Is there an available function in dplyr that can do this? Thanks!

CodePudding user response：

You could use pmax to find maximum value of each row, and cumsum to sum occurences above 5 :

df  %>% mutate(newcol=cumsum(do.call(pmax,select(.,everything()))>=5) 1)

   co11 newcol
1     1      1
2     1      1
3     1      1
4     1      1
5     1      1
6     5      2
7     6      3
8     1      3
9     1      3
10    1      3
11    2      3
12    3      3
13    4      3
14    5      4
15    8      5
16    1      5
17    1      5
18    2      5
19    2      5
20    8      6
21   10      7

CodePudding user response：

Waldi's answer is very good, here is a slightly modified version:

library(dplyr)

df %>% 
  group_by(col2 =cumsum(co11 >= 5) 1)

 co11 col2
1     1    1
2     1    1
3     1    1
4     1    1
5     1    1
6     5    2
7     6    3
8     1    3
9     1    3
10    1    3
11    2    3
12    3    3
13    4    3
14    5    4
15    8    5
16    1    5
17    1    5
18    2    5
19    2    5
20    8    6
21   10    7