How can I convert character vector to numeric-time but keep same structure

Time:02-17

I need to convert the "ride_length" column from a character vector to numeric, preferably keeping the same HHH:MM:SS format. If the only way to do this is to convert to seconds or minutes, that is an acceptable alternative. Ultimately I need to analyze this data in a meaningful way, which I cannot do while it is a character vector. I have tried strptime(), chron(), as.POSIXct(), and as.numeric(), all with no success. "ride_length" was created in Excel before being imported.

I found a workaround by creating a new "ride_length" column and then converting to numeric using:

q1_2021$ride_length <- difftime(q1_2021$ended_at, q1_2021$started_at)
q1_2021$ride_length <- as.numeric(as.character(q1_2021$ride_length))

But, if possible, I want to understand how to answer the original question using the Excel-created "ride_length" column.

Updating with dput(head()) which I'm hoping provides reproducible data. I removed the unnecessary columns:

structure(list(
  started_at = c("2/6/2021 15:56", "2/5/2021 14:22", "2/6/2021 20:21",
                 "2/27/2021 21:07", "2/20/2021 23:23", "2/28/2021 17:50"),
  ended_at = c("2/27/2021 14:06", "2/26/2021 9:42", "2/13/2021 11:28",
               "3/5/2021 15:11", "2/25/2021 16:12", "3/5/2021 2:14"),
  ride_length = c("502:09:14", "499:20:38", "159:07:08",
                  "138:04:00", "112:49:54", "104:24:14")
), row.names = c(NA, -6L), class = c("tbl_df", "tbl", "data.frame"))
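For reference, here is a minimal sketch of the difftime() workaround applied to the first two rows of this sample. It assumes the timestamps are month/day/year strings that still need parsing, and it uses UTC only as a placeholder time zone, since the real one is not stated. Setting the units explicitly avoids relying on difftime()'s automatic unit choice. The results differ slightly from "ride_length" because the sample timestamps carry no seconds.

```r
# Timestamps copied from the first two rows of the dput() output above
started_at <- c("2/6/2021 15:56", "2/5/2021 14:22")
ended_at   <- c("2/27/2021 14:06", "2/26/2021 9:42")

# Assumption: month/day/year with 24-hour times; UTC is used only to avoid
# daylight-saving shifts, the real time zone of the data is not known
fmt   <- "%m/%d/%Y %H:%M"
start <- as.POSIXct(started_at, format = fmt, tz = "UTC")
end   <- as.POSIXct(ended_at, format = fmt, tz = "UTC")

# Fixing units = "secs" keeps the numeric result from depending on
# difftime()'s automatic unit choice
ride_seconds <- as.numeric(difftime(end, start, units = "secs"))
ride_seconds
#> [1] 1807800 1797600
```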

Rstudio screenshot

CodePudding user response:

Since there is not a base R function that works for you, why not create your own?

The following function converts durations in the given format to seconds:

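as_seconds <- function(durations) {
  # Split each duration on ":" and convert one element at a time
  vapply(strsplit(durations, ":"), function(x) {
    # Left-pad with zeros so "MM:SS" is treated as "0:MM:SS",
    # then weight hours, minutes and seconds by 3600, 60 and 1
    sum(c(rep(0, 3 - length(x)), as.numeric(x)) * c(3600, 60, 1))
  }, numeric(1))
}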

Now, since you don't have reproducible data (we can't copy-paste data from a screenshot), let's create a simple sample vector:

times <- c("332:21:46", "254:12:01", "1:22", "13:12:01")

So we can do:

as_seconds(times)
#> [1] 1196506  915121      82   47521

It's quite reasonable to just use the number of seconds for analysis: remember you can store these in a separate column, so the durations remain available in character format for display. There are other things you can do with the seconds, for example converting them into periods with the lubridate package:

lubridate::seconds_to_period(as_seconds(times))
#> [1] "13d 20H 21M 46S" "10d 14H 12M 1S"  "1M 22S"          "13H 12M 1S" 
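As a minimal sketch of the separate-column idea, assuming the "ride_length" values from the dput() output in the question and a hypothetical column name ride_seconds, it could look like this:

```r
# Helper from this answer, repeated so the sketch runs on its own
as_seconds <- function(durations) {
  vapply(strsplit(durations, ":"), function(x) {
    sum(c(rep(0, 3 - length(x)), as.numeric(x)) * c(3600, 60, 1))
  }, numeric(1))
}

# ride_length values copied from the dput() output in the question
q1_2021 <- data.frame(
  ride_length = c("502:09:14", "499:20:38", "159:07:08",
                  "138:04:00", "112:49:54", "104:24:14")
)

# ride_seconds is a hypothetical column name; the character column is left untouched
q1_2021$ride_seconds <- as_seconds(q1_2021$ride_length)

# Numeric summaries now work, e.g. the mean and maximum ride length in hours
mean(q1_2021$ride_seconds) / 3600
max(q1_2021$ride_seconds) / 3600
```

The original character column stays available for display, while the numeric column can be used for sorting, summarising or plotting.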

If you only want to keep the character format in your data frame, you can just convert to seconds on demand. For example, we can use order along with our as_seconds function to put the durations in order:

times[order(as_seconds(times))]
#> [1] "1:22"      "13:12:01"  "254:12:01" "332:21:46"

Or reverse order:

times[order(-as_seconds(times))]
#> [1] "332:21:46" "254:12:01" "13:12:01"  "1:22"
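If the HHH:MM:SS display format has to be rebuilt from seconds (say, after computing a mean or a sum), integer division and the modulo operator are enough. The sketch below uses a hypothetical helper name, format_hms, and assumes non-negative durations:

```r
# format_hms is a hypothetical helper name, not a base R or lubridate function
format_hms <- function(secs) {
  secs    <- round(secs)               # drop any fractional seconds
  hours   <- secs %/% 3600             # whole hours, allowed to exceed 24
  minutes <- (secs %% 3600) %/% 60     # leftover whole minutes
  seconds <- secs %% 60                # leftover seconds
  sprintf("%d:%02d:%02d", hours, minutes, seconds)
}

format_hms(c(1196506, 915121, 82, 47521))
#> [1] "332:21:46" "254:12:01" "0:01:22"   "13:12:01"
```

Note that a short input such as "1:22" comes back as "0:01:22", because the hours field is always written out.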

Created on 2022-02-16 by the reprex package (v2.0.1)
