I have a data.frame ordered by ID with a column of numeric values that I would like to bin into groups, increasing the group number only when a certain target value/trigger is met or exceeded. I haven't had success with seq(), seq_along(), or data.table cumsum(), but I'm sure there must be a way.
Example data.frame with desired group column below. In this example, the sequence generating the group column should increase only when a number >= 300 appears in the value column.
dat = data.frame(ID=1:10, value=c(0,2,1,12,68,300,41,0,72959,51), group=c(1,1,1,1,1,2,2,2,3,3))
> dat
   ID value group
1   1     0     1
2   2     2     1
3   3     1     1
4   4    12     1
5   5    68     1
6   6   300     2
7   7    41     2
8   8     0     2
9   9 72959     3
10 10    51     3
CodePudding user response:
We may use cumsum on a logical vector to create the group:
library(dplyr)
dat %>%
  # value >= 300 is TRUE at each trigger row; cumsum counts them, + 1 starts groups at 1
  mutate(group2 = cumsum(value >= 300) + 1)
-output
   ID value group group2
1   1     0     1      1
2   2     2     1      1
3   3     1     1      1
4   4    12     1      1
5   5    68     1      1
6   6   300     2      2
7   7    41     2      2
8   8     0     2      2
9   9 72959     3      3
10 10    51     3      3
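If you would rather not load dplyr, the same idea should work in base R. The following is only a minimal sketch: the `threshold` variable is a name introduced here for illustration, and it assumes the value column has no NA entries (cumsum returns NA for every position from the first NA onward).

```r
# Rebuild the example data without the hand-written group column.
dat <- data.frame(ID = 1:10,
                  value = c(0, 2, 1, 12, 68, 300, 41, 0, 72959, 51))

# Hypothetical name for the trigger value from the question.
threshold <- 300

# value >= threshold is TRUE on each trigger row. cumsum() treats TRUE as 1
# and FALSE as 0, so it counts the triggers seen so far; adding 1 makes the
# rows before the first trigger fall into group 1 rather than group 0.
dat$group2 <- cumsum(dat$value >= threshold) + 1

dat$group2
# [1] 1 1 1 1 1 2 2 2 3 3
```

Note that if the very first value already meets the threshold, numbering starts at 2 with this approach; whether that is what you want depends on how the groups are meant to be counted.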