I have a data frame with boolean data in each column. I want to know the length (count) of each period of consecutive FALSE values, allowing TRUE values in between as long as there are no more than 10 of them in a row. If there are more than 10 TRUE values in a row, the count should restart from 0 at the next FALSE value. Eventually I want the number of such periods per column and the length of each period. I know there are plenty of posts on finding the number of consecutive TRUE values, but nothing on adding a condition like this.
I tried to use the rleid function together with rowid from the data.table package, but all I got was the count of consecutive TRUE values. Here is the code I used to test on a random vector:
rowid(rleid(a))*a
From there on I am stuck.
Ideally from the vector
a <- c(FALSE, FALSE, FALSE, TRUE, TRUE, FALSE, TRUE, TRUE, FALSE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, FALSE, FALSE)
I want to get a vector: 9, 2
CodePudding user response:
Replace the TRUE values with NA, fill each run of NAs with FALSE if the run has 9 or fewer values, and then replace the remaining NAs with TRUE. Then use rle and keep the lengths of the FALSE runs.
library(magrittr)
library(zoo)
a %>%
  replace(., ., NA) %>%
  na.locf0(maxgap = 9) %>%
  replace(., is.na(.), TRUE) %>%
  rle %>%
  with(lengths[!values])
## [1] 9 2
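To make the pipeline above easier to follow, here is a sketch of the same steps written out one at a time, so each intermediate vector can be inspected. The names step1, step2, step3, runs and result are only illustrative, and the vector is the one from the question written with rep() for brevity.

```r
library(zoo)  # provides na.locf0()

a <- c(FALSE, FALSE, FALSE, TRUE, TRUE, FALSE, TRUE, TRUE, FALSE,
       rep(TRUE, 10), FALSE, FALSE)

# 1. Mark every TRUE as missing
step1 <- replace(a, a, NA)

# 2. Carry the previous FALSE forward over NA runs of length <= 9;
#    longer NA runs (and any leading NAs) are left untouched
step2 <- na.locf0(step1, maxgap = 9)

# 3. Whatever is still NA was a long (or leading) TRUE run
step3 <- replace(step2, is.na(step2), TRUE)

# 4. Run lengths of the merged FALSE periods
runs <- rle(step3)
result <- runs$lengths[!runs$values]
result
## [1] 9 2
```

Note that maxgap = 9 means a run of exactly 10 TRUE values already ends a period, which is what the expected output 9, 2 requires for this vector.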
CodePudding user response:
Just use rle and hack the result so that any sequence of fewer than 10 TRUE values is set to FALSE. Your desired output is not entirely clear, but you can try something like:
r <- rle(a)
r$values[r$lengths < 10 & r$values] <- FALSE
(r2 <- rle(inverse.rle(r)))
#Run Length Encoding
# lengths: int [1:3] 9 10 2
# values : logi [1:3] FALSE TRUE FALSE
r2$lengths[!r2$values]
#[1] 9 2
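Since the question asks for the result per column of a data frame, the rle approach could be wrapped in a small function and applied with lapply. This is only a sketch in base R: the helper name false_periods, its max_true argument, and the example data frame df are made up for illustration, and the function assumes the columns are logical with no NA values.

```r
# Hypothetical helper: lengths of FALSE periods in a logical vector, where
# TRUE runs of at most `max_true` values do not interrupt a period.
false_periods <- function(x, max_true = 9L) {
  r <- rle(x)
  # Relabel short TRUE runs as FALSE so they merge with neighbouring FALSE runs
  # (this also applies to short TRUE runs at the very start or end)
  r$values[r$values & r$lengths <= max_true] <- FALSE
  # Re-encode so that adjacent FALSE runs collapse into a single run
  r2 <- rle(inverse.rle(r))
  r2$lengths[!r2$values]
}

# Made-up example data: column x is the vector from the question
df <- data.frame(
  x = c(FALSE, FALSE, FALSE, TRUE, TRUE, FALSE, TRUE, TRUE, FALSE,
        rep(TRUE, 10), FALSE, FALSE),
  y = c(rep(TRUE, 12), rep(FALSE, 4), TRUE, rep(FALSE, 4))
)

periods <- lapply(df, false_periods)  # length of each period, per column
n_periods <- lengths(periods)         # number of periods, per column

periods
n_periods
```

Here periods is a list with one integer vector per column, and n_periods gives the count of periods in each column. Changing max_true changes how long a TRUE run may be before it ends a period.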