I am looking for an equivalent to Excel's 24 within R. For example
You want to calculate the duration of 2 times:
Column 1 | Column 2 |
---|---|
22:00 | 04:00 |
04:00 | 08:00 |
When I use duplicate() function in R, it calculates Row B correctly, to total 4 hours.
But in Row B, because the duration goes over 24 hours, it calculates the duration to -18 hours.
Is there a way of using an equivalent of the 24 function in Excel (24 Col 2, Row A - Col, Row A) in R, to make sure that Row A totals 6 hours?
Many thanks in advance.
CodePudding user response:
Here's a base R solution:
mydf <- read.table(text = "
Column1 Column2
22:00 04:00
04:00 08:00
", header = TRUE)
mytimediff <- function(x, y) {
h <- vector(mode = "numeric", length = length(x))
x <- strptime(x, format = "%H:%M")
y <- strptime(y, format = "%H:%M")
for (i in seq_along(x)) {
if (x[i] < y[i]) {
h[i] <- y[i] - x[i]
} else {
h[i] <- (y[i] as.difftime(1, unit = "days")) - x[i]
}
}
h
}
mydf$DiffHours <- mytimediff(mydf$Column1, mydf$Column2)
mydf
Column1 Column2 DiffHours
1 22:00 04:00 6
2 04:00 08:00 4
Note that the calculated output is just a numeric vector. Further handling would be needed if you want R to treat it as a datetime.
Also, the function created here assumes that the time provided in the second argument is always later than the time provided in the first argument.
CodePudding user response:
Here is a lubridate
solution.
library(lubridate)
diffHours <- function(x, y){
h <- hours(0)
for(i in seq_along(x)){
h[i] <- if(x[i] > y[i])
y[i] hours(24) - x[i]
else y[i] - x[i]
}
h
}
diffHours(hm(df1$Column1), hm(df1$Column2))
#[1] "6H 0M 0S" "4H 0M 0S"
Data
df1 <- read.table(text = "
Column1 Column2
22:00 04:00
04:00 08:00
", header = TRUE)
CodePudding user response:
Similar to Rui Barradas solution:
library(dplyr)
library(lubridate)
mydf %>%
mutate(col = if_else(hm(Column2) - hm(Column1) > 0,
hm(Column2) - hm(Column1),
hm(Column2) - hm(Column1) hm("24:00")))
This returns
Column1 Column2 col
1 22:00 04:00 6H 0M 0S
2 04:00 08:00 4H 0M 0S
CodePudding user response:
Here's a solution that converts everything to dates first by pasting the origin and adds the seconds of a day after if there is a new date, i.e. which.max == 1
, finally using rowDiffs
of the matrixStats
package. Result is decimal according to the selected unit.
timeDiff <- function(X, u=3600) {
M <- matrix(as.POSIXct(paste('1970-01-01', as.matrix(X))), ncol=2)
nd <- apply(M, 1, which.max) == 1
M[nd, 2] <- M[nd, 2] 86400
return(as.vector(matrixStats::rowDiffs(M) / u))
}
Usage
timeDiff(df1) ## hours, default
# [1] 6 4
timeDiff(df1, u=60) ## minutes
# [1] 360 240
timeDiff(df1, u=1) ## seconds
# [1] 21600 14400
timeDiff(df1, u=86400) ## days
# [1] 0.2500000 0.1666667
Also works with seconds and minutes:
timeDiff(DF)
# [1] 16.131944 15.386667 10.865278 6.574722 21.262500 8.357778 18.644167
# [8] 10.201389 3.958056 16.640000 23.271111 1.742500 2.324167 18.923333
# [15] 14.415000 19.888889 3.886667 17.932778 23.859167 2.779444 7.576944
# [22] 14.439167 3.162778 19.729167 15.787778
Note: Set u=1
in the function to throw seconds as default. If you don't want to install the package, you could also write apply(M, 1, diff) / u
in the last line but it's slower.
Data:
DF <- structure(list(V1 = c("17:00:19", "01:16:54", "05:32:53", "21:27:57",
"05:01:44", "05:30:36", "19:07:42", "14:15:14", "01:02:20", "09:36:33",
"22:12:24", "11:11:37", "08:00:00", "00:29:13", "01:50:47", "23:59:48",
"23:30:10", "01:36:27", "10:46:17", "13:55:54", "01:05:33", "01:54:57",
"09:15:41", "21:09:35", "04:06:19"), V2 = c("09:08:14", "16:40:06",
"16:24:48", "04:02:26", "02:17:29", "13:52:04", "13:46:21", "00:27:19",
"04:59:49", "02:14:57", "21:28:40", "12:56:10", "10:19:27", "19:24:37",
"16:15:41", "19:53:08", "03:23:22", "19:32:25", "10:37:50", "16:42:40",
"08:40:10", "16:21:18", "12:25:27", "16:53:20", "19:53:35")), class = "data.frame", row.names = c(NA,
-25L))