I can't figure out what I am doing wrong. I want to find the longest run of consecutive days with a value >= 1. My training data looks like this:
df <- data.frame(id =1, value = c(0 , 0 , 0 , 0 , 0 ,0 ,0 , 0 , 0, 0 , 0, 0, 0 , 0 , 0 , 0 , 0 , 0 , 0, 0, 0 , 0 ,0 ,0 , 0, 0, 0 , 0, 0, 0 , 0 , 0 , 0 , 0 , 0 , 0 ,0, 0 , 0, 0, 0 , 0,
0, 0 ,0 ,0 , 0 , 0 ,0 ,0 , 0, 0 , 0 , 0 , 0 , 0 ,0 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 0 ,0 , 0, 0, 0 , 0 ,0 , 0 , 0, 0 ,0 , 0 ,0 ,0 , 0, 0 , 0, 0, 0,
0 , 0 , 0 , 0, 0 , 0 , 0 , 0 , 0 ,0 , 0 , 0 , 0 ,0 , 0 , 0 , 0 ,0 , 0 , 0 ,0 ,0 ,0 , 0 , 0 , 0 ,0 , 0 , 0 ,0 , 0, 0 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 0 ,0 ,0,
0 , 0 , 1 , 0 , 0 ,8 , 8, 5 ,3, 3 , 1 ,0 , 0 , 0, 0, 0 , 0 ,10 , 6 , 5 ,4 , 3, 3, 5, 7 , 8 , 7 ,6 , 5 ,4, 3 , 2 ,0 , 0 , 0 , 0 , 0 , 0 , 0 , 0 ,0 , 0,
0 , 0 , 0 , 0, 1 , 1 , 2 , 2, 2 , 2 , 2 , 2 , 2 , 2 , 0 ,0 ,0, 0 , 0 , 0, 0, 0 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 0 ,0 , 0 ,0, 12, 9, 8, 6 , 5, 5 , 4, 0 , 0,
0 , 0 , 0 , 0 , 0 , 0 ,0 , 0 ,0, 0 , 0 , 0 , 0 ,0 ,0 , 0 , 0 , 0 , 0 ,0 , 0 , 6 , 7 , 3 , 0 , 0 , 0, 0 , 0, 0, 0, 0 ,0 , 0 ,0 ,0, 0 , 0 , 0, 0, 0 ,0,
0 , 0 , 0 , 2 ,0 , 0 , 0 ,0 ,0 , 0 , 0, 0 , 0, 0 , 0, 0, 0 , 0, 0 ,0 ,0 ,0 , 0, 0 , 0, 0 , 0 , 0 , 0 , 0 , 0 , 0 ,0 ,0 ,0 ,0, 0 , 0 , 0, 0, 0, 0,
0 , 0 , 0 , 0 , 0 , 0 , 0 ,0 ,0 , 0 , 0, 0, 0 , 0, 0 , 0 , 0 ,0 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 0, 0, 0 ,0 , 0, 0 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 0,
0 , 0 , 0 ,0 , 0 ,0 , 0 , 0 , 0 , 0, 0 ,0 ,0, 0 , 0 , 0, 0 ,0 , 0 , 0, 0, 0 , 0 , 0, 0, 0 , 0 , 0 ,0 ,0))
cons <- max(rle(df$value >=1)$lengths)
cons
The result is 128, which is wrong. The correct maximum length (the longest run of consecutive days with values greater than or equal to 1) should be 15. It seems that the filter >=1 doesn't work.
CodePudding user response:
Broken down, you can use rle where value is greater than or equal to 1:
my_rle <- rle(df$value >= 1)
my_rle
Run Length Encoding
lengths: int [1:15] 128 1 2 6 6 15 14 10 19 7 ...
values : logi [1:15] FALSE TRUE FALSE TRUE FALSE TRUE ...
Then, you can subset and take the maximum of lengths only where the values in the rle result are TRUE:
max(my_rle$lengths[my_rle$values])
Output
[1] 15
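To see why the unfiltered max() returned 128, it may help to run the same steps on a small vector. The sketch below uses a made-up vector x and a hypothetical helper longest_run(), neither of which comes from the question. It assumes there are no NA values, and the helper returns 0 when nothing meets the threshold, since max() of an empty vector would otherwise give -Inf with a warning.

```r
# Toy vector (made up for illustration): 4 zeros, a run of 3, 1 zero, a run of 2
x <- c(0, 0, 0, 0, 5, 2, 1, 0, 3, 4)

r <- rle(x >= 1)
r$lengths   # 4 3 1 2: runs of FALSE and TRUE alike
r$values    # FALSE TRUE FALSE TRUE

max(r$lengths)            # 4: the longest run overall, here a run of zeros
max(r$lengths[r$values])  # 3: the longest run where x >= 1

# Hypothetical helper: longest run of values >= threshold, 0 if there is none
longest_run <- function(v, threshold = 1) {
  r <- rle(v >= threshold)
  hits <- r$lengths[r$values]  # keep only the lengths of the TRUE runs
  if (length(hits) == 0) 0L else max(hits)
}

longest_run(x)        # 3
longest_run(c(0, 0))  # 0
```

Calling longest_run(df$value) on the data from the question should give the same 15 as above.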
CodePudding user response:
cons <- max(rle(df$value >= 1)$lengths[rle(df$value >= 1)$values])
cons
CodePudding user response:
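Here is a solution using dplyr together with rleid from data.table:
library(dplyr)
library(data.table)
df %>%
  mutate(grp = data.table::rleid(value >= 1)) %>%  # number each run of equal TRUE/FALSE
  filter(value >= 1) %>%                           # keep only the days with value >= 1
  count(grp, name = "run_length") %>%              # length of each remaining run
  pull(run_length) %>%
  max()                                            # 15 for the data in the question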
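Since the data frame has an id column, the same idea could probably be applied per id. This is only a sketch in base R, assuming a made-up two-id data frame df2 (not the data from the question), rows already sorted by day within each id, and no NA values:

```r
# Made-up example with two ids (not from the question)
df2 <- data.frame(
  id    = c(1, 1, 1, 1, 1, 2, 2, 2, 2),
  value = c(0, 3, 4, 0, 1, 2, 2, 2, 0)
)

# For each id, take the longest run of value >= 1 (0 if there is no such run)
longest_by_id <- tapply(df2$value, df2$id, function(v) {
  r <- rle(v >= 1)
  hits <- r$lengths[r$values]
  if (length(hits) == 0) 0L else max(hits)
})

longest_by_id  # id 1 -> 2, id 2 -> 3
```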