Here is the data frame in r,
ride Start time End time
1 2019-09-01 06:02:09 2019-09-01 06:03:09
2 2019-09-01 10:25:21 2019-09-01 10:26:22
3 2019-09-01 11:10:34 2019-09-01 11:11:36
4 2019-09-01 18:10:45 2019-09-01 18:11:47
And I want to create a column in r to express the difference in time between each ride, for example, the difference in time(hours) between second ride and first ride is the start time of second ride minus the end time of first ride. I want the result like this:
ride Start time End time difftime
1 2019-09-01 06:02:09 2019-09-01 06:03:09 NA
2 2019-09-01 10:25:21 2019-09-01 10:26:22 4hours(10:25:21-06:03:09=4hours)
3 2019-09-01 11:10:34 2019-09-01 11:11:36 0.73 hour(11:10:34-10:26:22)
4 2019-09-02 00:10:45 2019-09-01 01:11:47 2019-09-02-00:10:45minus2019-09-01
11:11:36
Thanks in advance!
CodePudding user response:
Offset the end_time vector by one row. Delete the last element in the end_time vector and add a null element to its beginning.
ride Start time End time
1 2019-09-01 06:02:09 0
2 2019-09-01 10:25:21 2019-09-01 06:03:09
3 2019-09-01 11:10:34 2019-09-01 10:26:22
4 2019-09-01 18:10:45 2019-09-01 11:11:36
As such, the end time of ride 1 is matched with the start time of ride 2, and the end time of ride 2 is matched with the start time of ride 3, etc.
You can then add a new column to the data frame using the difftime operator. I used the origin date '1970-01-01' for the first value of the end_time vector below:
ride <- c(1, 2, 3, 4)
start_time <- as.POSIXct(c('2019-09-01 06:02:09', '2019-09-01 10:25:21', '2019-09-01 11:10:34', '2019-09-02 00:10:45'))
end_time <- as.POSIXct(c('1970-01-01', '2019-09-01 06:03:09', '2019-09-01 10:26:22', '2019-09-01 11:11:36'))
ride_data <- data.frame(ride, start_time, end_time)
ride_data$diff <- with(ride_data, difftime(start_time, end_time, units = "secs"))
CodePudding user response:
The biggest challenge was to read your data ;) The difftime problem is fairly straight forward with the tidyverse, in particular {dplyr}.
library(lubridate)
library(tidyverse)
df <-
read.table(text = "ride Start time End time
1 2019-09-01 06:02:09 2019-09-01 06:03:09
2 2019-09-01 10:25:21 2019-09-01 10:26:22
3 2019-09-01 11:10:34 2019-09-01 11:11:36
4 2019-09-01 18:10:45 2019-09-01 18:11:47", header = TRUE) %>%
transmute(ride = ride, start = ymd_hms(paste(Start, time)),
end = ymd_hms(paste(End, time.1)))
df %>%
mutate(difftime = difftime(end, lag(start), units = "hours"))
#> ride start end difftime
#> 1 1 2019-09-01 06:02:09 2019-09-01 06:03:09 NA hours
#> 2 2 2019-09-01 10:25:21 2019-09-01 10:26:22 4.4036111 hours
#> 3 3 2019-09-01 11:10:34 2019-09-01 11:11:36 0.7708333 hours
#> 4 4 2019-09-01 18:10:45 2019-09-01 18:11:47 7.0202778 hours
Created on 2021-11-14 by the reprex package (v2.0.1)