I'm trying to write a loop with an if/else statement that pulls the next row's timing and records it as the timeout. If there is no next row for that car (i.e. no further row with the same car id#), it should return end/exit instead.
Data and envisioned output.
Here is my code, but it doesn't work at all; I'm probably not getting the fundamentals right.
for(i in 1:dim(df2)[1]){
  if(df2$car.id[i] == df2$car.id[i + 1]){
    return$timein[i + 1]
  }else{
    print("end")
  }
}
)
CodePudding user response:
Several notes below, but up front try this:
df2$Timeout <- ave(df2$Timein, df2$car.id, FUN = function(z) c(z[-1], NA))
The above code returns the next value of df2$Timein per df2$car.id, properly resetting when the next row is a different car.id. The order matters, so if you need the Timein sorted, then you should sort it before calling ave(.). (This will deal correctly with car.id being out of order and even mixed.)
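As a minimal, self-contained sketch of the same idea (the frame toy and its values are made up for illustration and are not the asker's df2), the rows are sorted by car.id and Timein first, and then ave(.) shifts Timein up by one within each car.id:

```r
# Hypothetical toy data, deliberately out of order and with mixed car.id
toy <- data.frame(
  car.id = c(2, 1, 1, 2, 1),
  Timein = as.POSIXct("2021-12-18 17:00:00", tz = "UTC") + c(600, 0, 300, 900, 120)
)

# Sort so that "next row" means "next Timein for the same car"
toy <- toy[order(toy$car.id, toy$Timein), ]

# Within each car.id, drop the first value and pad with NA at the end,
# so every row receives the following row's Timein (NA on the car's last row)
toy$Timeout <- ave(toy$Timein, toy$car.id, FUN = function(z) c(z[-1], NA))
toy
```

Keeping NA for the last row of each car leaves Timeout as a date-time column, which makes later arithmetic such as Timeout - Timein possible.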
Issues with your (image of) code. If the above doesn't work, you'll need to clarify these points.
- nrow(df2) is the more canonical method than dim(df2)[1]. Instead of 1:nrow(.), though, I recommend for (i in seq_len(nrow(df2))), as it behaves better in at least one corner case: on a zero-row frame, 1:nrow(df2) is c(1, 0) and the loop body still runs, whereas seq_len(0) is empty.
- Your [i + 1] indexing will go beyond the index limit and return NA, which will eventually error with "missing value where TRUE/FALSE needed".
- return$timein[i + 1] seems wrong, unless you have a named-list or data.frame object that is return; I discourage that, as it can be confused (by people) with the base R primitive (function) return(.). If it is not an object, then you are using it wrong, and frankly I don't know what it should be since a for loop here seems unnecessary.
- Your expected output is not fully clear, but I'll guess that you want either a timestamp or the literal "End". The latter will break your timestamps, converting them from POSIXt-class objects to strings. In general, a column in a frame cannot be mixed classes.
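The points above can be sketched on made-up data (empty, ids, and df are hypothetical names, not the asker's objects). The last part is one way an explicit loop could stay in bounds, shown only for comparison with the ave(.) one-liner:

```r
# Corner case: on a zero-row frame, 1:nrow(.) counts down instead of being empty
empty <- data.frame(car.id = numeric(0))
1:nrow(empty)         # 1 0
seq_len(nrow(empty))  # integer(0)

# Indexing one past the end yields NA, and if() cannot evaluate NA
ids <- c(1, 1, 2)
res <- tryCatch(
  if (ids[3] == ids[4]) "same" else "different",
  error = function(e) conditionMessage(e)
)
res  # in an English locale: "missing value where TRUE/FALSE needed"

# An in-bounds loop that keeps Timeout as POSIXct; NA marks "no next row for this car"
df <- data.frame(
  car.id = c(1, 1, 2),
  Timein = as.POSIXct("2021-12-18 17:00:00", tz = "UTC") + c(0, 300, 900)
)
df$Timeout <- as.POSIXct(NA)
for (i in seq_len(max(nrow(df) - 1, 0))) {  # stop one row early, so i + 1 always exists
  if (df$car.id[i] == df$car.id[i + 1]) {
    df$Timeout[i] <- df$Timein[i + 1]
  }
}
df
```

The loop never touches the final row, so it stays NA, as does any row followed by a different car.id.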
CodePudding user response:
Try using dplyr. Working with toy data.
library(dplyr)
dat %>% group_by( car.id ) %>%
mutate( Timeout=lead(as.character(Timein), default="END") ) %>% ungroup
# A tibble: 10 x 4
car.id car.type Timein Timeout
<dbl> <dbl> <dttm> <chr>
1 14359825 1 2021-12-18 17:28:58 2021-12-18 17:33:58
2 14359825 1 2021-12-18 17:33:58 2021-12-18 18:03:58
3 14359825 1 2021-12-18 18:03:58 2021-12-18 18:08:58
4 14359825 1 2021-12-18 18:08:58 2021-12-18 18:13:58
5 14359825 1 2021-12-18 18:13:58 END
6 243095743 2 2021-12-18 18:30:38 2021-12-18 18:37:18
7 243095743 2 2021-12-18 18:37:18 2021-12-18 19:17:18
8 243095743 2 2021-12-18 19:17:18 2021-12-18 19:23:58
9 243095743 2 2021-12-18 19:23:58 2021-12-18 19:30:38
10 243095743 2 2021-12-18 19:30:38 END
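If you want Timeout as a date-time column instead of character, you can always recast it (the END entries become NA; this assumes the result above was assigned back to dat)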
as.POSIXct( dat$Timeout, format="%F %T" )
[1] "2021-12-18 17:33:58 CET" "2021-12-18 18:03:58 CET"
[3] "2021-12-18 18:08:58 CET" "2021-12-18 18:13:58 CET"
[5] NA "2021-12-18 18:37:18 CET"
[7] "2021-12-18 19:17:18 CET" "2021-12-18 19:23:58 CET"
[9] "2021-12-18 19:30:38 CET" NA
or directly use
dat %>% group_by( car.id ) %>% mutate( Timeout=lead( Timein ) )
Data
dat <- structure(list(car.id = c(14359825, 14359825, 14359825, 14359825,
14359825, 243095743, 243095743, 243095743, 243095743, 243095743
), car.type = c(1, 1, 1, 1, 1, 2, 2, 2, 2, 2), Timein = structure(c(1639844938.6685,
1639845238.6685, 1639847038.6685, 1639847338.6685, 1639847638.6685,
1639848638.6685, 1639849038.6685, 1639851438.6685, 1639851838.6685,
1639852238.6685), class = c("POSIXct", "POSIXt"))), row.names = c(NA,
10L), class = "data.frame")