How can I use a function that takes as arguments an element of a vector and the next element in the

Time:09-23

I have a character vector containing dates and times in chronological order, in the format YYYY-MM-DD HH:MM:SS (in R terms: %Y-%m-%d %H:%M:%S). I've also written a function that returns the time difference between two POSIX objects in a specific format (get_formatted_time_difference) and a function that converts strings to POSIX objects (string_to_POSIX).

What I would like to write is a function that takes my character vector (length: n) as an argument and returns a vector of time differences (length: n-1). The first element of the result should be the time difference between the first and second elements of the input, the second element should be the difference between the second and third elements, and so on.

For clarity, what I want is:

Initial Vector           Resulting Vector
"2022-09-18 17:00:00"    "0 day(s), 05:00:00"
"2022-09-18 12:00:00"    "0 day(s), 19:00:00"
"2022-09-17 17:00:00"    "0 day(s), 05:00:00"
"2022-09-17 12:00:00"    "0 day(s), 17:00:00"
"2022-09-16 19:00:00"    -

I am unsure how to approach this problem. So far I have thought about building a matrix from two vectors (let's call them v1 and v2), padding each with an NA to create a lag, as shown below:

v1                       v2
NA                       "2022-09-18 17:00:00"
"2022-09-18 17:00:00"    "2022-09-18 12:00:00"
"2022-09-18 12:00:00"    "2022-09-17 17:00:00"
"2022-09-17 17:00:00"    "2022-09-17 12:00:00"
"2022-09-17 12:00:00"    "2022-09-16 19:00:00"
"2022-09-16 19:00:00"    NA

The idea would be to apply a function from the apply family to every row that doesn't contain an NA. The custom function passed to apply would have to convert both values to POSIX and then call get_formatted_time_difference with the two columns as arguments.

I am not sure how to do this, nor whether it is the best way to do it. Could you tell me how you would go about solving this problem? Also, if your method is different from mine, I'd still be curious to hear how to use one of the apply functions to solve it the way I first imagined.

As always, thanks a lot for your help!

library(reprex)
library(magrittr)

# Functions ---------------------------------------------------------------

get_formatted_time_difference <- function(time1, time2) {
  # Params must be POSIX
  total_amount_seconds <- difftime(time1, time2, units = "secs") %>% abs()
  
  amount_of_days <- floor(total_amount_seconds / (60 * 60 * 24))
  
  amount_of_hours <- floor(
    (total_amount_seconds
     - amount_of_days * (60 * 60 * 24)) / (60 * 60)
  )
  
  amount_of_minutes <- floor(
    (total_amount_seconds
     - amount_of_days * (60 * 60 * 24)
     - amount_of_hours * (60 * 60)) / 60
  )
  
  amount_of_seconds <- floor(
    (total_amount_seconds
     - amount_of_days * (60 * 60 * 24)
     - amount_of_hours * (60 * 60)
     - amount_of_minutes * 60)
  )
  
  hours <- amount_of_hours
  minutes <- amount_of_minutes
  seconds <- amount_of_seconds
  
  if (amount_of_hours < 10) {hours <- paste0("0", amount_of_hours)}
  if (amount_of_minutes < 10) {minutes <- paste0("0", amount_of_minutes)}
  if (amount_of_seconds < 10) {seconds <- paste0("0", amount_of_seconds)}
  
  return(
    paste0(
      floor(amount_of_days), " day(s), ",
      hours, ":", minutes, ":", seconds
    )
  )
}

string_to_POSIX <- function(time) {
  as.POSIXct(time, format = "%Y-%m-%d %H:%M:%S")
}


# Data --------------------------------------------------------------------

events <- c("2022-09-18 17:00:00", 
            "2022-09-18 12:00:00",
            "2022-09-17 17:00:00",
            "2022-09-17 12:00:00",
            "2022-09-16 19:00:00")

# -------------------------------------------------------------------------

v1 <- c(NA, events)
v2 <- c(events, NA)

(mat <- matrix(data = c(v1, v2), ncol = 2))
#>      [,1]                  [,2]                 
#> [1,] NA                    "2022-09-18 17:00:00"
#> [2,] "2022-09-18 17:00:00" "2022-09-18 12:00:00"
#> [3,] "2022-09-18 12:00:00" "2022-09-17 17:00:00"
#> [4,] "2022-09-17 17:00:00" "2022-09-17 12:00:00"
#> [5,] "2022-09-17 12:00:00" "2022-09-16 19:00:00"
#> [6,] "2022-09-16 19:00:00" NA

Created on 2022-09-21 with reprex v2.0.2

CodePudding user response:

Starting from your

events <- c("2022-09-18 17:00:00", 
            "2022-09-18 12:00:00",
            "2022-09-17 17:00:00",
            "2022-09-17 12:00:00",
            "2022-09-16 19:00:00")

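Simply diff(as.POSIXlt(events)) does the trick: it returns a difftime vector of length n-1 holding the gap between each element and the next. Because your sample dates run from newest to oldest, the values come out negative; wrap the call in abs() if you only want the magnitudes.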
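
If you also need the "D day(s), HH:MM:SS" strings from the question, one possible sketch is to convert the diff() result to seconds and format it in a vectorized way. The names times, secs and format_gap are hypothetical (they are not from the question), and parsing with tz = "UTC" is an assumption made here so that daylight-saving shifts cannot affect the gaps:

```r
events <- c("2022-09-18 17:00:00",
            "2022-09-18 12:00:00",
            "2022-09-17 17:00:00",
            "2022-09-17 12:00:00",
            "2022-09-16 19:00:00")

# Assumption: parse in UTC so daylight-saving shifts cannot distort the gaps
times <- as.POSIXct(events, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

# diff() returns a difftime of length n - 1; convert it to absolute seconds
secs <- abs(as.numeric(diff(times), units = "secs"))

# Hypothetical vectorized formatter: seconds -> "D day(s), HH:MM:SS"
format_gap <- function(secs) {
  sprintf("%d day(s), %02d:%02d:%02d",
          as.integer(secs %/% 86400),           # whole days
          as.integer((secs %% 86400) %/% 3600), # hours left within the day
          as.integer((secs %% 3600) %/% 60),    # minutes left within the hour
          as.integer(secs %% 60))               # seconds left within the minute
}

gaps <- format_gap(secs)
```

Since sprintf() is vectorized, no loop or apply call is needed here, and gaps has one element fewer than events.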

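As for the apply-family route you asked about, here is a rough sketch of two ways it could look. gap_between is a hypothetical stand-in for your get_formatted_time_difference combined with string_to_POSIX (it is defined here only to keep the example self-contained), and tz = "UTC" is again an assumption:

```r
events <- c("2022-09-18 17:00:00",
            "2022-09-18 12:00:00",
            "2022-09-17 17:00:00",
            "2022-09-17 12:00:00",
            "2022-09-16 19:00:00")

# Hypothetical helper: takes two date-time strings, returns "D day(s), HH:MM:SS"
gap_between <- function(a, b) {
  fmt <- "%Y-%m-%d %H:%M:%S"
  secs <- abs(as.numeric(difftime(as.POSIXct(a, format = fmt, tz = "UTC"),
                                  as.POSIXct(b, format = fmt, tz = "UTC"),
                                  units = "secs")))
  sprintf("%d day(s), %02d:%02d:%02d",
          as.integer(secs %/% 86400),
          as.integer((secs %% 86400) %/% 3600),
          as.integer((secs %% 3600) %/% 60),
          as.integer(secs %% 60))
}

# Option 1: pair each element with its successor; no NA padding required
gaps_mapply <- mapply(gap_between,
                      head(events, -1),  # all but the last element
                      tail(events, -1),  # all but the first element
                      USE.NAMES = FALSE)

# Option 2: the lagged matrix from the question, keeping only rows without NA
mat <- cbind(c(NA, events), c(events, NA))
adjacent <- mat[complete.cases(mat), , drop = FALSE]
gaps_apply <- apply(adjacent, 1, function(row) gap_between(row[1], row[2]))
```

Both options should give the same n-1 strings. The head()/tail() pairing avoids building the padded matrix at all, which is why I would lean towards mapply if you want to stay in the apply family.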