Create a loop to record next record time and date (based on unique id) in same row

Time:12-19

I'm trying to write a loop with an if/else statement that pulls the next row's time-in and records it as the current row's time-out. If there is no next row for the same car id, it should return "end"/"exit" instead. Data and envisioned output

Here is my code, but it doesn't work at all; I'm probably not getting the fundamentals right.

for(i in 1:dim(df2)[1]){
  if(df2$car.id[i] == df2$car.id[i + 1]){
    return$timein[i + 1]
  }else{
    print("end")
  } 
    }
)

CodePudding user response:

Several notes below, but up front try this:

df2$Timeout <- ave(df2$Timein, df2$car.id, FUN = function(z) c(z[-1], NA))

The above code returns the next value of df2$Timein per df2$car.id, properly resetting when the next row is a different car.id. The order matters, so if you need the Timein sorted, then you should sort it before calling ave(.). (This will deal correctly with car.id being out of order and even mixed.)
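As a minimal sketch of what that call does, here is the same line applied to a small made-up frame. The values and the interleaved car.id order are assumptions chosen for illustration, not the asker's data.

```r
# Toy data (hypothetical values): two cars whose rows are interleaved
df2 <- data.frame(
  car.id = c(1, 2, 1, 2, 1),
  Timein = as.POSIXct(c("2021-12-18 17:00:00", "2021-12-18 17:05:00",
                        "2021-12-18 17:30:00", "2021-12-18 18:00:00",
                        "2021-12-18 18:10:00"), tz = "UTC")
)

# Within each car.id, drop the first Timein and pad the end with NA,
# so each row receives the Timein of the next row for the same car
df2$Timeout <- ave(df2$Timein, df2$car.id, FUN = function(z) c(z[-1], NA))

print(df2)
```

Rows 1 and 3 (car 1) pick up the Timein of rows 3 and 5, row 2 (car 2) picks up row 4, and the last row of each car gets NA. The column stays a POSIXct date-time.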

Issues with your (image of) code. If the above doesn't work, you'll need to clarify these points.

  1. nrow(df2) is the more canonical method than dim(df2)[1].

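  2. Instead of 1:nrow(.), though, I recommend for (i in seq_len(nrow(df2))), as it behaves better in at least one corner case: with a zero-row frame, 1:0 iterates over 1 and 0, whereas seq_len(0) is empty and the loop body never runs.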

  3. Your [i + 1] indexing will go beyond the index limit on the last row and return NA, so the if(.) condition becomes NA and the loop errors with "missing value where TRUE/FALSE needed".

  4. return$timein[i + 1] seems wrong, unless you have a named-list or data.frame object that is named return; I discourage that, as it can be confused (by people) with the base R primitive (function) return(.). If it is not an object, then you are using it wrong, and frankly I don't know what it should be since a for loop here seems unnecessary.

  5. Your expected output is not fully clear, but I'll guess that you want either a timestamp or the literal "End". The latter will break your timestamps, converting them from POSIXt-class objects to strings. In general, a column in a frame cannot be mixed classes.
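If a loop is still wanted, the points above could be combined roughly as follows. This is only a sketch on made-up data: it assumes the rows are already sorted by car.id and then Timein, and it uses NA instead of the literal "End" so that the column keeps its date-time class.

```r
# Toy data (hypothetical values), sorted by car.id and then by Timein
df2 <- data.frame(
  car.id = c(1, 1, 2, 2, 2),
  Timein = as.POSIXct(c("2021-12-18 17:00:00", "2021-12-18 17:05:00",
                        "2021-12-18 18:00:00", "2021-12-18 18:10:00",
                        "2021-12-18 18:20:00"), tz = "UTC")
)

# Pre-fill with a date-time NA so the new column is POSIXct from the start
df2$Timeout <- as.POSIXct(NA, tz = "UTC")

n <- nrow(df2)
for (i in seq_len(n)) {
  # Only look at row i + 1 when it exists and belongs to the same car
  if (i < n && df2$car.id[i] == df2$car.id[i + 1]) {
    df2$Timeout[i] <- df2$Timein[i + 1]
  }
  # Otherwise Timeout stays NA, marking the last record for that car
}

print(df2)
```

The i < n check is what avoids the out-of-range comparison on the last row, and seq_len(n) means the loop is simply skipped for an empty frame.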

CodePudding user response:

Try using dplyr. The example works with the toy data given under Data at the end.

library(dplyr)

dat %>% group_by( car.id ) %>% 
  mutate( Timeout=lead(as.character(Timein), default="END") ) %>% ungroup
# A tibble: 10 x 4
      car.id car.type Timein              Timeout            
       <dbl>    <dbl> <dttm>              <chr>              
 1  14359825        1 2021-12-18 17:28:58 2021-12-18 17:33:58
 2  14359825        1 2021-12-18 17:33:58 2021-12-18 18:03:58
 3  14359825        1 2021-12-18 18:03:58 2021-12-18 18:08:58
 4  14359825        1 2021-12-18 18:08:58 2021-12-18 18:13:58
 5  14359825        1 2021-12-18 18:13:58 END                
 6 243095743        2 2021-12-18 18:30:38 2021-12-18 18:37:18
 7 243095743        2 2021-12-18 18:37:18 2021-12-18 19:17:18
 8 243095743        2 2021-12-18 19:17:18 2021-12-18 19:23:58
 9 243095743        2 2021-12-18 19:23:58 2021-12-18 19:30:38
10 243095743        2 2021-12-18 19:30:38 END 

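If you want Timeout as a date-time column instead of character, you can always recast it (assuming the result above was assigned back to dat); the "END" entries become NA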

as.POSIXct( dat$Timeout, format="%F %T" )
 [1] "2021-12-18 17:33:58 CET" "2021-12-18 18:03:58 CET"
 [3] "2021-12-18 18:08:58 CET" "2021-12-18 18:13:58 CET"
 [5] NA                        "2021-12-18 18:37:18 CET"
 [7] "2021-12-18 19:17:18 CET" "2021-12-18 19:23:58 CET"
 [9] "2021-12-18 19:30:38 CET" NA

or directly use

dat %>% group_by( car.id ) %>% mutate( Timeout=lead( Timein ) )
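A short sketch of that last variant on made-up data, to check that the column keeps its date-time class when no "END" default is supplied. The frame name toy and its values are hypothetical, and dplyr is assumed to be installed.

```r
library(dplyr)

# Toy data (hypothetical values): two cars, two records each
toy <- data.frame(
  car.id = c(1, 1, 2, 2),
  Timein = as.POSIXct(c("2021-12-18 17:00:00", "2021-12-18 17:05:00",
                        "2021-12-18 18:00:00", "2021-12-18 18:10:00"),
                      tz = "UTC")
)

res <- toy %>%
  group_by(car.id) %>%
  mutate(Timeout = lead(Timein)) %>%  # lead() pads with NA by default
  ungroup()

# Inspect the class of the new column
class(res$Timeout)
```

Because the padding value is NA and not a string, nothing forces a conversion to character, and the last record of each car can be found with is.na(res$Timeout).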

Data

dat <- structure(list(car.id = c(14359825, 14359825, 14359825, 14359825, 
14359825, 243095743, 243095743, 243095743, 243095743, 243095743
), car.type = c(1, 1, 1, 1, 1, 2, 2, 2, 2, 2), Timein = structure(c(1639844938.6685, 
1639845238.6685, 1639847038.6685, 1639847338.6685, 1639847638.6685, 
1639848638.6685, 1639849038.6685, 1639851438.6685, 1639851838.6685, 
1639852238.6685), class = c("POSIXct", "POSIXt"))), row.names = c(NA, 
10L), class = "data.frame")