Equivalent function in R to "24" in Excel to measure time durations?-CodePudding

I am looking for an equivalent to Excel's 24 within R. For example

You want to calculate the duration of 2 times:

Column 1	Column 2
22:00	04:00
04:00	08:00

When I use duplicate() function in R, it calculates Row B correctly, to total 4 hours.

But in Row B, because the duration goes over 24 hours, it calculates the duration to -18 hours.

Is there a way of using an equivalent of the 24 function in Excel (24 Col 2, Row A - Col, Row A) in R, to make sure that Row A totals 6 hours?

Many thanks in advance.

CodePudding user response：

Here's a base R solution:

mydf <- read.table(text = "
Column1     Column2
22:00   04:00
04:00   08:00
", header = TRUE)

mytimediff <- function(x, y) {
  h <- vector(mode = "numeric", length = length(x))
  x <- strptime(x, format = "%H:%M")
  y <- strptime(y, format = "%H:%M")
  for (i in seq_along(x)) {
    if (x[i] < y[i]) {
      h[i] <- y[i] - x[i]
    } else {
      h[i] <- (y[i]   as.difftime(1, unit = "days")) - x[i]
    }
  }
  h
}

mydf$DiffHours <- mytimediff(mydf$Column1, mydf$Column2)
mydf

  Column1 Column2 DiffHours
1   22:00   04:00         6
2   04:00   08:00         4

Note that the calculated output is just a numeric vector. Further handling would be needed if you want R to treat it as a datetime.

Also, the function created here assumes that the time provided in the second argument is always later than the time provided in the first argument.

CodePudding user response：

Here is a lubridate solution.

library(lubridate)

diffHours <- function(x, y){
  h <- hours(0)
  for(i in seq_along(x)){
    h[i] <- if(x[i] > y[i]) 
      y[i]   hours(24) - x[i] 
    else y[i] - x[i]
  }
  h
}

diffHours(hm(df1$Column1), hm(df1$Column2))
#[1] "6H 0M 0S" "4H 0M 0S"

Data

df1 <- read.table(text = "
Column1     Column2
22:00   04:00
04:00   08:00
", header = TRUE)

CodePudding user response：

Similar to Rui Barradas solution:

library(dplyr)
library(lubridate)

mydf %>% 
  mutate(col = if_else(hm(Column2) - hm(Column1) > 0, 
                       hm(Column2) - hm(Column1),
                       hm(Column2) - hm(Column1)   hm("24:00")))

This returns

  Column1 Column2      col
1   22:00   04:00 6H 0M 0S
2   04:00   08:00 4H 0M 0S

CodePudding user response：

Here's a solution that converts everything to dates first by pasting the origin and adds the seconds of a day after if there is a new date, i.e. which.max == 1, finally using rowDiffs of the matrixStats package. Result is decimal according to the selected unit.

timeDiff <- function(X, u=3600) {
  M <- matrix(as.POSIXct(paste('1970-01-01', as.matrix(X))), ncol=2)
  nd <- apply(M, 1, which.max) == 1
  M[nd, 2] <- M[nd, 2]   86400
  return(as.vector(matrixStats::rowDiffs(M) / u))
}

Usage

timeDiff(df1)  ## hours, default
# [1] 6 4
timeDiff(df1, u=60)  ## minutes
# [1] 360 240
timeDiff(df1, u=1)  ## seconds
# [1] 21600 14400
timeDiff(df1, u=86400)  ## days
# [1] 0.2500000 0.1666667

Also works with seconds and minutes:

timeDiff(DF)
# [1] 16.131944 15.386667 10.865278  6.574722 21.262500  8.357778 18.644167
# [8] 10.201389  3.958056 16.640000 23.271111  1.742500  2.324167 18.923333
# [15] 14.415000 19.888889  3.886667 17.932778 23.859167  2.779444  7.576944
# [22] 14.439167  3.162778 19.729167 15.787778

Note: Set u=1 in the function to throw seconds as default. If you don't want to install the package, you could also write apply(M, 1, diff) / u in the last line but it's slower.

Data:

DF <- structure(list(V1 = c("17:00:19", "01:16:54", "05:32:53", "21:27:57", 
"05:01:44", "05:30:36", "19:07:42", "14:15:14", "01:02:20", "09:36:33", 
"22:12:24", "11:11:37", "08:00:00", "00:29:13", "01:50:47", "23:59:48", 
"23:30:10", "01:36:27", "10:46:17", "13:55:54", "01:05:33", "01:54:57", 
"09:15:41", "21:09:35", "04:06:19"), V2 = c("09:08:14", "16:40:06", 
"16:24:48", "04:02:26", "02:17:29", "13:52:04", "13:46:21", "00:27:19", 
"04:59:49", "02:14:57", "21:28:40", "12:56:10", "10:19:27", "19:24:37", 
"16:15:41", "19:53:08", "03:23:22", "19:32:25", "10:37:50", "16:42:40", 
"08:40:10", "16:21:18", "12:25:27", "16:53:20", "19:53:35")), class = "data.frame", row.names = c(NA, 
-25L))