Home > Software design >  How to create an increasing index based on a certain condition?
How to create an increasing index based on a certain condition?

Time:02-13

Suppose I have this dataframe:

df <- data.frame(co11 = c(rep(1, 5), 5, 6, rep(1, 3), 2, 3, 4, 5, 8, rep(1, 2), rep(2, 2), 8, 10))

I would like to create another column (col2) with increasing group index whenever a value in a row is at least 5. To illustrate, here is the resulting df that I would like to get:

   co11  col2
1     1     1
2     1     1
3     1     1
4     1     1
5     1     1
6     5     2
7     6     3
8     1     3
9     1     3
10    1     3
11    2     3
12    3     3
13    4     3
14    5     4
15    8     5
16    1     5
17    1     5
18    2     5
19    2     5
20    8     6
21   10     7

Is there an available function in dplyr that can do this? Thanks!

CodePudding user response:

You could use pmax to find maximum value of each row, and cumsum to sum occurences above 5 :

df  %>% mutate(newcol=cumsum(do.call(pmax,select(.,everything()))>=5) 1)

   co11 newcol
1     1      1
2     1      1
3     1      1
4     1      1
5     1      1
6     5      2
7     6      3
8     1      3
9     1      3
10    1      3
11    2      3
12    3      3
13    4      3
14    5      4
15    8      5
16    1      5
17    1      5
18    2      5
19    2      5
20    8      6
21   10      7

CodePudding user response:

Waldi's answer is very good, here is a slightly modified version:

library(dplyr)

df %>% 
  group_by(col2 =cumsum(co11 >= 5) 1) 
 co11 col2
1     1    1
2     1    1
3     1    1
4     1    1
5     1    1
6     5    2
7     6    3
8     1    3
9     1    3
10    1    3
11    2    3
12    3    3
13    4    3
14    5    4
15    8    5
16    1    5
17    1    5
18    2    5
19    2    5
20    8    6
21   10    7
  • Related