Question regarding finding the difference between two datetime variables *updated*

Time:04-02

I have a dataset that includes two columns (trip_start_date, trip_end_date). Both columns were of type chr, so I converted them to dttm using this code:

df[["started_at"]] <-  as.POSIXct(df[["started_at"]], format= "%Y-%m-%d %H:%M:%S") %>%  ymd_hms()
df[["ended_at"]] <-  as.POSIXct(df[["ended_at"]], format= "%Y-%m-%d %H:%M:%S") %>% ymd_hms()

Then I tried to get the difference between the two columns with this code:

df1 <- df %>%
  difftime(ended_at,started_at, units = 'mins')

But I receive this error:

Error in as.POSIXct.default(time1, tz = tz) : 
  do not know how to convert 'time1' to class “POSIXct”

Any tips on how to solve this issue?

dataframe

head(df)

ride_id        rideable_type started_at          ended_at            start_station_n~ start_station_id end_station_name
  <chr>          <chr>         <dttm>              <dttm>              <chr>            <chr>            <chr>           
1 CFA86D4455AA1~ classic_bike  2021-03-16 08:32:30 2021-03-16 08:36:34 Humboldt Blvd &~ 15651            Stave St & Armi~
2 30D9DC61227D1~ classic_bike  2021-03-28 01:26:28 2021-03-28 01:36:55 Humboldt Blvd &~ 15651            Central Park Av~
3 846D87A15682A~ classic_bike  2021-03-11 21:17:29 2021-03-11 21:33:53 Shields Ave & 2~ 15443            Halsted St & 35~
4 994D05AA75A16~ classic_bike  2021-03-11 13:26:42 2021-03-11 13:55:41 Winthrop Ave & ~ TA1308000021     Broadway & Sher~
5 DF7464FBE92D8~ classic_bike  2021-03-21 09:09:37 2021-03-21 09:27:33 Glenwood Ave & ~ 525              Chicago Ave & S~
6 CEBA8516FD17F~ classic_bike  2021-03-20 11:08:47 2021-03-20 11:29:39 Glenwood Ave & ~ 525              Chicago Ave & S~
# ... with 6 more variables: end_station_id <chr>, start_lat <dbl>, start_lng <dbl>, end_lat <dbl>, end_lng <dbl>,
#   member_casual <chr>
dput(head(df[, c(3, 4)]))

structure(list(started_at = structure(c(1615883550, 1616894788, 
1615497449, 1615469202, 1616317777, 1616238527), tzone = "UTC", class = c("POSIXct", 
"POSIXt")), ended_at = structure(c(1615883794, 1616895415, 1615498433, 
1615470941, 1616318853, 1616239779), tzone = "UTC", class = c("POSIXct", 
"POSIXt"))), row.names = c(NA, -6L), class = c("tbl_df", "tbl", 
"data.frame"))

CodePudding user response:

You are missing a call to mutate().

library(dplyr)
df %>%
  as_tibble() %>% 
  mutate(diff = difftime(ended_at, started_at, units = 'mins'))
#> # A tibble: 6 x 3
#>   started_at          ended_at            diff          
#>   <dttm>              <dttm>              <drtn>        
#> 1 2021-03-16 08:32:30 2021-03-16 08:36:34  4.066667 mins
#> 2 2021-03-28 01:26:28 2021-03-28 01:36:55 10.450000 mins
#> 3 2021-03-11 21:17:29 2021-03-11 21:33:53 16.400000 mins
#> 4 2021-03-11 13:26:42 2021-03-11 13:55:41 28.983333 mins
#> 5 2021-03-21 09:09:37 2021-03-21 09:27:33 17.933333 mins
#> 6 2021-03-20 11:08:47 2021-03-20 11:29:39 20.866667 mins
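
If a plain numeric column is easier to work with than a <drtn> column (for example, for summaries or filtering), wrapping the difftime() call in as.numeric() is one option. The following is only a sketch: the data frame is rebuilt from the first two rows of the question's dput() output, and the column name ride_mins is made up for illustration.

```r
library(dplyr)

# First two rows of the question's data, rebuilt by hand (times are UTC)
df <- data.frame(
  started_at = as.POSIXct(c("2021-03-16 08:32:30", "2021-03-28 01:26:28"), tz = "UTC"),
  ended_at   = as.POSIXct(c("2021-03-16 08:36:34", "2021-03-28 01:36:55"), tz = "UTC")
)

# as.numeric() drops the difftime class and keeps the value in the
# units requested from difftime(), here minutes.
# ride_mins is a hypothetical column name.
out <- df %>%
  mutate(ride_mins = as.numeric(difftime(ended_at, started_at, units = "mins")))

print(out)
```

Setting units explicitly matters here: without it, difftime() picks a unit automatically, so the numbers could end up in seconds, minutes, hours or days depending on the data.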

The pipe takes the result of the left-hand side and inserts it as the first argument of the function on the right-hand side. Your original code is therefore equivalent to:

# your code
df %>% difftime(ended_at, started_at, units = 'mins')

# "unpiped" version of your code which does not make sense as-is
difftime(df, ended_at, started_at, units = 'mins')

# either use mutate as shown above or use the following
difftime(df$ended_at, df$started_at, units = 'mins')
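
To see where the error comes from without loading any packages, the sketch below passes a data frame as the first argument of difftime() and captures the resulting message with tryCatch(). It also shows with() as one more base R way to refer to the columns by bare name. The two-row data frame is an assumption, rebuilt from the question's dput() output.

```r
# Two rows rebuilt from the question's data (times are UTC)
df <- data.frame(
  started_at = as.POSIXct(c("2021-03-16 08:32:30", "2021-03-28 01:26:28"), tz = "UTC"),
  ended_at   = as.POSIXct(c("2021-03-16 08:36:34", "2021-03-28 01:36:55"), tz = "UTC")
)

# A data frame in the first slot of difftime() cannot be converted to
# POSIXct, so the call fails; tryCatch() returns the message instead of stopping.
msg <- tryCatch(
  difftime(df, df$ended_at, df$started_at, units = "mins"),
  error = function(e) conditionMessage(e)
)
print(msg)

# with() evaluates the expression inside df, so bare column names work
mins <- with(df, difftime(ended_at, started_at, units = "mins"))
print(mins)
```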