I have a data frame containing dates and for each date the number of events that took place. From this I add a field telling me if the number of events was above average or not.
Date | Events | Above Average |
---|---|---|
01/01 | 7 | 0 |
02/01 | 8 | 1 |
03/01 | 8 | 1 |
04/01 | 6 | 0 |
05/01 | 8 | 1 |
06/01 | 9 | 1 |
07/01 | 4 | 0 |
08/01 | 7 | 0 |
If I then perform a run-length encoding (RLE) on the Above Average column, I get:
Count | Value |
---|---|
1 | FALSE |
2 | TRUE |
1 | FALSE |
2 | TRUE |
2 | FALSE |
How can I use this information to add an additional field to my original data frame, as shown below:
Date | Events | Above Average | Run Above Av |
---|---|---|---|
01/01 | 7 | 0 | 0 |
02/01 | 8 | 1 | 2 |
03/01 | 8 | 1 | 2 |
04/01 | 6 | 0 | 0 |
05/01 | 8 | 1 | 2 |
06/01 | 9 | 1 | 2 |
07/01 | 4 | 0 | 0 |
08/01 | 7 | 0 | 0 |
CodePudding user response:
You seem to be looking for the rle lengths, each repeated as many times as its own length, then multiplied by the sign of the Above Average column (so that rows in a below-average run get 0):
library(dplyr)

df %>%
  mutate(`Run Above Av` = rep(rle(`Above Average`)$lengths,
                              times = rle(`Above Average`)$lengths) * sign(`Above Average`))
#>    Date Events Above Average Run Above Av
#> 1 01/01      7             0            0
#> 2 02/01      8             1            2
#> 3 03/01      8             1            2
#> 4 04/01      6             0            0
#> 5 05/01      8             1            2
#> 6 06/01      9             1            2
#> 7 07/01      4             0            0
#> 8 08/01      7             0            0
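To make the one-liner easier to follow, here is a rough base R sketch of the same idea broken into steps, with the rle computed only once. The intermediate names `runs` and `run_len` are just illustrative, and the data frame is rebuilt from the question's table so the snippet runs on its own.

```r
# Data copied from the question; check.names = FALSE keeps the space in the column name
df <- data.frame(
  Date = c("01/01", "02/01", "03/01", "04/01", "05/01", "06/01", "07/01", "08/01"),
  Events = c(7L, 8L, 8L, 6L, 8L, 9L, 4L, 7L),
  `Above Average` = c(0L, 1L, 1L, 0L, 1L, 1L, 0L, 0L),
  check.names = FALSE
)

# Step 1: run-length encode the indicator column
runs <- rle(df$`Above Average`)
# runs$lengths: 1 2 1 2 2
# runs$values:  0 1 0 1 0

# Step 2: repeat each run length by itself, giving one value per original row
run_len <- rep(runs$lengths, times = runs$lengths)
# run_len: 1 2 2 1 2 2 2 2

# Step 3: zero out the rows that belong to a below-average run
df$`Run Above Av` <- run_len * sign(df$`Above Average`)
# df$`Run Above Av`: 0 2 2 0 2 2 0 0
```

Since Above Average only holds 0 and 1 here, multiplying by the column itself would give the same result as `sign()`.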
Data from question in reproducible format
df <- structure(list(Date = c("01/01", "02/01", "03/01", "04/01", "05/01",
                              "06/01", "07/01", "08/01"),
                     Events = c(7L, 8L, 8L, 6L, 8L, 9L, 4L, 7L),
                     `Above Average` = c(0L, 1L, 1L, 0L, 1L, 1L, 0L, 0L)),
                class = "data.frame", row.names = c(NA, -8L))
Created on 2022-06-22 by the reprex package (v2.0.1)
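As a possible alternative, not part of the answer above, base R's `inverse.rle()` can expand a modified rle object back to one value per row. The sketch below assumes the indicator column only contains 0 and 1, and rebuilds the question's data so it is self-contained; `runs` is an illustrative name.

```r
# Data copied from the question; check.names = FALSE keeps the space in the column name
df <- data.frame(
  Date = c("01/01", "02/01", "03/01", "04/01", "05/01", "06/01", "07/01", "08/01"),
  Events = c(7L, 8L, 8L, 6L, 8L, 9L, 4L, 7L),
  `Above Average` = c(0L, 1L, 1L, 0L, 1L, 1L, 0L, 0L),
  check.names = FALSE
)

runs <- rle(df$`Above Average`)

# Replace each run's value with its length if the run is above average (value 1),
# or with 0 otherwise (assumes the column holds only 0 and 1)
runs$values <- runs$lengths * runs$values
# runs$values: 0 2 0 2 0

# Expand the modified runs back to one value per row
df$`Run Above Av` <- inverse.rle(runs)
# df$`Run Above Av`: 0 2 2 0 2 2 0 0
```

This avoids calling `rle()` twice and keeps the run lengths and the zeroing in one object, at the cost of being a little less obvious than the `rep()` version.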