Suppose I have this dataframe:
df <- data.frame(co11 = c(rep(1, 5), 5, 6, rep(1, 3), 2, 3, 4, 5, 8, rep(1, 2), rep(2, 2), 8, 10))
I would like to create another column (col2
) with increasing group index whenever a value in a row is at least 5. To illustrate, here is the resulting df
that I would like to get:
co11 col2
1 1 1
2 1 1
3 1 1
4 1 1
5 1 1
6 5 2
7 6 3
8 1 3
9 1 3
10 1 3
11 2 3
12 3 3
13 4 3
14 5 4
15 8 5
16 1 5
17 1 5
18 2 5
19 2 5
20 8 6
21 10 7
Is there an available function in dplyr
that can do this? Thanks!
CodePudding user response:
You could use pmax
to find maximum value of each row, and cumsum
to sum occurences above 5
:
df %>% mutate(newcol=cumsum(do.call(pmax,select(.,everything()))>=5) 1)
co11 newcol
1 1 1
2 1 1
3 1 1
4 1 1
5 1 1
6 5 2
7 6 3
8 1 3
9 1 3
10 1 3
11 2 3
12 3 3
13 4 3
14 5 4
15 8 5
16 1 5
17 1 5
18 2 5
19 2 5
20 8 6
21 10 7
CodePudding user response:
Waldi's answer is very good, here is a slightly modified version:
library(dplyr)
df %>%
group_by(col2 =cumsum(co11 >= 5) 1)
co11 col2
1 1 1
2 1 1
3 1 1
4 1 1
5 1 1
6 5 2
7 6 3
8 1 3
9 1 3
10 1 3
11 2 3
12 3 3
13 4 3
14 5 4
15 8 5
16 1 5
17 1 5
18 2 5
19 2 5
20 8 6
21 10 7