Find the longest consecutive day

Time:02-18

I can't figure out what I am doing wrong. I want to find the longest run of consecutive days with a value >= 1. My sample data looks like this:

df <- data.frame(id =1, value = c(0 , 0 , 0 , 0 , 0  ,0  ,0 , 0 , 0,  0 , 0,  0,  0 , 0 , 0 , 0 , 0 , 0 , 0,  0,  0 , 0  ,0  ,0 , 0,  0,  0 , 0,  0,  0 , 0 , 0 , 0 , 0 , 0 , 0  ,0,  0 , 0,  0,  0 , 0,
                                   0,  0  ,0  ,0 , 0 , 0  ,0  ,0 , 0,  0 , 0 , 0 , 0 , 0  ,0 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 0  ,0 , 0,  0,  0 , 0  ,0 , 0 , 0,  0  ,0 , 0  ,0  ,0 , 0,  0 , 0,  0,  0,
                                   0 , 0 , 0 , 0,  0 , 0 , 0 , 0 , 0  ,0 , 0 , 0 , 0  ,0 , 0 , 0 , 0  ,0 , 0 , 0  ,0  ,0  ,0 , 0 , 0 , 0  ,0 , 0 , 0  ,0 , 0,  0 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 0  ,0  ,0,
                                   0 , 0 , 1 , 0 , 0  ,8 , 8,  5  ,3,  3 , 1  ,0 , 0 , 0,  0,  0 , 0 ,10 , 6 , 5  ,4 , 3,  3,  5,  7 , 8 , 7  ,6 , 5  ,4,  3 , 2  ,0 , 0 , 0 , 0 , 0 , 0 , 0 , 0  ,0 , 0,
                                   0 , 0 , 0 , 0,  1 , 1 , 2 , 2,  2 , 2 , 2 , 2 , 2 , 2 , 0  ,0  ,0,  0 , 0 , 0,  0,  0 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 0  ,0 , 0  ,0, 12,  9,  8,  6 , 5,  5 , 4,  0 , 0,
                                   0 , 0 , 0 , 0 , 0 , 0  ,0 , 0  ,0,  0 , 0 , 0 , 0  ,0  ,0 , 0 , 0 , 0 , 0  ,0 , 0 , 6 , 7 , 3 , 0 , 0 , 0,  0 , 0,  0,  0,  0  ,0 , 0  ,0  ,0,  0 , 0 , 0,  0,  0  ,0,
                                   0 , 0 , 0 , 2  ,0 , 0 , 0  ,0  ,0 , 0 , 0,  0 , 0,  0 , 0,  0,  0 , 0,  0  ,0  ,0  ,0 , 0,  0 , 0,  0 , 0 , 0 , 0 , 0 , 0 , 0  ,0  ,0  ,0  ,0,  0 , 0 , 0,  0,  0,  0,
                                   0 , 0 , 0 , 0 , 0 , 0 , 0  ,0  ,0 , 0 , 0,  0,  0 , 0,  0 , 0 , 0  ,0 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 0,  0,  0  ,0 , 0,  0 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 0 , 0,
                                   0 , 0 , 0  ,0 , 0  ,0 , 0 , 0 , 0 , 0,  0  ,0  ,0,  0 , 0 , 0,  0  ,0 , 0 , 0,  0,  0 , 0 , 0,  0,  0 , 0 , 0  ,0  ,0))
  

cons <- max(rle(df$value >=1)$lengths)
cons

The result is 128, which is wrong. The correct maximum length (the longest run of consecutive days with values greater than or equal to 1) is 15. It seems that the filter >= 1 doesn't work.

CodePudding user response:

Broken down, you can apply rle to the condition that value is greater than or equal to 1:

my_rle <- rle(df$value >= 1)
my_rle

Run Length Encoding
  lengths: int [1:15] 128 1 2 6 6 15 14 10 19 7 ...
  values : logi [1:15] FALSE TRUE FALSE TRUE FALSE TRUE ...

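Note that rle encodes runs of FALSE as well as runs of TRUE. The 128 from the question is the first run, which is FALSE (the leading days below 1), so taking max over all lengths picks it up. Instead, subset lengths to the runs whose value is TRUE and take the maximum of those: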

max(my_rle$lengths[my_rle$values])

Output

[1] 15
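
To see why the unfiltered call returns the wrong number, it can help to run the same steps on a short vector. This is only a sketch: the vector x below is made up for illustration and is not part of the question's data.

```r
# Hypothetical short series, made up for illustration
x <- c(0, 0, 0, 0, 2, 3, 0, 1, 1, 1, 0, 0)

r <- rle(x >= 1)
r$lengths   # 4 2 1 3 2  (runs of FALSE and TRUE alike)
r$values    # FALSE TRUE FALSE TRUE FALSE

max(r$lengths)            # 4: the longest run overall, here a run of zeros
max(r$lengths[r$values])  # 3: the longest run where x >= 1
```

The filter itself works; the unfiltered max simply also considers the runs where the condition is FALSE.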

CodePudding user response:

cons <- max(rle(df$value >= 1)$lengths[rle(df$value >= 1)$values])
cons
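
One edge case worth keeping in mind: if no day meets the threshold, the subset of lengths is empty and max() returns -Inf with a warning. A possible way to guard against that is sketched below. The helper name longest_run and its threshold argument are assumptions made for this example, and treating NA as a break in the run is one choice among several.

```r
# Hypothetical helper; the name and arguments are not from the thread
longest_run <- function(x, threshold = 1) {
  # NA days count as "not meeting the threshold", so they end a run
  r <- rle(!is.na(x) & x >= threshold)
  hits <- r$lengths[r$values]
  # Return 0 instead of -Inf when no day qualifies
  if (length(hits) == 0) 0L else max(hits)
}

longest_run(c(0, 0, 0))            # no qualifying day
longest_run(c(1, 2, NA, 3, 4, 5))  # NA splits the series into runs of 2 and 3
```

Calling longest_run(df$value) on the question's data should then agree with the answers above.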

CodePudding user response:

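Here is a solution that uses data.table::rleid together with dplyr:

library(dplyr)  # provides %>%, mutate, filter, count, pull

df %>%
  # rleid gives every run of identical TRUE/FALSE values its own id
  mutate(grp = data.table::rleid(value >= 1)) %>%
  # keep only the days that meet the condition
  filter(value >= 1) %>%
  # number of days in each remaining run
  count(grp, name = "run_length") %>%
  pull(run_length) %>%
  max()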
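
Since the data frame carries an id column, the same run-length idea can presumably be applied per id when there is more than one. The following is a minimal base R sketch under that assumption; the data frame toy is made up for illustration and is not from the question.

```r
# Hypothetical two-id data, made up for illustration
toy <- data.frame(
  id    = c(1, 1, 1, 1, 1, 2, 2, 2, 2, 2),
  value = c(0, 3, 4, 0, 1, 2, 2, 2, 0, 0)
)

# Longest run of value >= 1 within each id
res <- tapply(toy$value, toy$id, function(v) {
  r <- rle(v >= 1)
  max(r$lengths[r$values])
})
res
```

This assumes the rows are already in day order within each id and that every id has at least one qualifying day; otherwise max() receives an empty vector.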