Home > Software design >  How can group the consecutive days by ids?
How can group the consecutive days by ids?

Time:11-09

I can find the consecutive days with that code.

df$consecutive<-c(NA, diff(df$activitydate)==1) 

But I want to find consecutive day for each id. I tried group_by(id) function with pipe. But it doesn't work. My data looks like that:

  # A tibble: 6 x 4
  id activitydate totalsteps consecutive                                                  
 <dbl> <date>            <dbl> <lgl>      
 1503960366 2016-05-12            0 NA                                
 1844505072 2016-04-24            0 FALSE                                                                                                                                    
 1844505072 2016-04-25            0 TRUE       
 1844505072 2016-04-26            0 TRUE                              
 1844505072 2016-05-02            0 FALSE      
 1844505072 2016-05-07            0 FALSE 

I think it is very easy but I am naive in R.

CodePudding user response:

You could use

library(dplyr)

df %>% 
  group_by(id) %>% 
  mutate(consec_2 = c(NA, diff(activitydate) == 1)) %>%
  ungroup()

This returns

# A tibble: 6 x 5
          id activitydate totalsteps consecutive consec_2
       <dbl> <date>            <dbl> <lgl>       <lgl>   
1 1503960366 2016-05-12            0 NA          NA      
2 1844505072 2016-04-24            0 FALSE       NA      
3 1844505072 2016-04-25            0 TRUE        TRUE    
4 1844505072 2016-04-26            0 TRUE        TRUE    
5 1844505072 2016-05-02            0 FALSE       FALSE   
6 1844505072 2016-05-07            0 FALSE       FALSE

CodePudding user response:

In case you prefer a base R solution:


df <- read.table(text = "
 1503960366 2016-05-12                                            
 1844505072 2016-04-24                                                                                                                                               
 1844505072 2016-04-25                  
 1844505072 2016-04-26                                         
 1844505072 2016-05-02               
 1844505072 2016-05-07", colClasses = c("factor", "Date"))


df$consecutive = ave(as.numeric(df$V2), df$V1, FUN = function(x) c(NA, diff(x) == 1))

> df

          V1         V2 consecutive
1 1503960366 2016-05-12          NA
2 1844505072 2016-04-24          NA
3 1844505072 2016-04-25           1
4 1844505072 2016-04-26           1
5 1844505072 2016-05-02           0
6 1844505072 2016-05-07           0

  • Related