I'm looking to generate a column in a table that iterates each time there is a state change. It's important this is arranged by date
For example, take the following input data
dummy_d = data.frame(date = seq.POSIXt(from = as.POSIXct("2022-02-14 09:00:00"),
to=as.POSIXct("2022-02-14 09:06:00"),
by = "1 min"),
valid = c(1,1,0,0,1,0,0))
The calculated output should look like
dummy_d %>% mutate(group = c(1,1,2,2,3,4,4))
That is to say, arranging by time, each time valid switches from 0 to 1 vice versa, group cumulatively builds.
Any help appreciated
CodePudding user response:
This solution uses the data.table
package.
library(data.table)
setDT(dummy_d)
dummy_d[, group := .GRP, by = rleid(as.Date(date), valid)]
dummy_d
# date valid group
# 1: 2022-02-14 09:00:00 1 1
# 2: 2022-02-14 09:01:00 1 1
# 3: 2022-02-14 09:02:00 0 2
# 4: 2022-02-14 09:03:00 0 2
# 5: 2022-02-14 09:04:00 1 3
# 6: 2022-02-14 09:05:00 0 4
# 7: 2022-02-14 09:06:00 0 4
CodePudding user response:
With dplyr
alone you could try the following:
library(dplyr)
dummy_d %>%
mutate(group = cumsum(valid != lag(valid, default = first(valid))) 1)
Output
date valid group
1 2022-02-14 09:00:00 1 1
2 2022-02-14 09:01:00 1 1
3 2022-02-14 09:02:00 0 2
4 2022-02-14 09:03:00 0 2
5 2022-02-14 09:04:00 1 3
6 2022-02-14 09:05:00 0 4
7 2022-02-14 09:06:00 0 4