Adding a new column and calculating ride length from start to finish

Time:10-20

I'm trying to add a new column to my data set and calculate the ride_length from start to finish.

Example of what glimpse returns:

$ started_at         <chr> "23/01/2021 16:14", 
$ ended_at           <chr> "23/01/2021 16:24",

My code:

data_trip_cleaned$ride_length <- difftime(data_trip_cleaned$started_at,data_trip_cleaned$ended_at,units = "mins")

Error:

Error in as.POSIXlt.character(x, tz, ...) : character string is not in a standard unambiguous format

CodePudding user response:

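Your error means difftime can't parse your date/time strings automatically: given character input, it only recognizes standard formats such as "2021-01-23 16:14", not day-first strings like "23/01/2021 16:14". From ?difftime: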

"Function difftime calculates a difference of two date/time objects and returns an object of class difftime with an attribute indicating the units."

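Your glimpse output shows started_at and ended_at as <chr> (character), not date-time objects, so convert them with as.POSIXct and an explicit format (see ?as.POSIXct). First confirm that a single value parses as you expect: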

as.POSIXct("23/01/2021 16:24", format = "%d/%m/%Y %H:%M")
# "2021-01-23 16:24:00 EST" (the time zone shown depends on your session)
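
One thing worth checking before converting whole columns: as.POSIXct returns NA, not an error, for any string that does not match the given format. A minimal sketch with made-up values (x and parsed are illustrative names, and tz = "UTC" is an assumption, not something from the question):

```r
# Illustrative values only; the second string is deliberately in a different format
x <- c("23/01/2021 16:14", "2021-01-23 16:24")
parsed <- as.POSIXct(x, format = "%d/%m/%Y %H:%M", tz = "UTC")

# Strings that do not match the format come back as NA
is.na(parsed)

# Count how many values failed to parse
sum(is.na(parsed))
```

If the count is above zero on your real columns, some rows use a different date format and need separate handling.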

For each column:

data_trip_cleaned$started_at <- as.POSIXct(
  data_trip_cleaned$started_at, format = "%d/%m/%Y %H:%M")
data_trip_cleaned$ended_at <- as.POSIXct(
  data_trip_cleaned$ended_at, format = "%d/%m/%Y %H:%M")

# or convert many columns at once
datetimes <- c("started_at", "ended_at")
data_trip_cleaned[datetimes] <- lapply(data_trip_cleaned[datetimes], FUN = function(x) as.POSIXct(x, format = "%d/%m/%Y %H:%M"))

# Then calculate the difference (the units are chosen automatically)
data_trip_cleaned$ride_length <- data_trip_cleaned$ended_at - data_trip_cleaned$started_at

# Alternatively, choose the units explicitly
difftime(data_trip_cleaned$ended_at, data_trip_cleaned$started_at, units = "secs")
# See ?difftime for the other options for units= (e.g. "mins")
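
Putting the steps together, here is a self-contained sketch you can run to check the approach. The data frame trips and its two rows are hypothetical stand-ins for data_trip_cleaned, and tz = "UTC" is an assumption; use whatever time zone your data was recorded in.

```r
# Hypothetical two-row data set mirroring the columns in the question
trips <- data.frame(
  started_at = c("23/01/2021 16:14", "23/01/2021 23:50"),
  ended_at   = c("23/01/2021 16:24", "24/01/2021 00:05"),
  stringsAsFactors = FALSE
)

# Parse both character columns as date-times (tz = "UTC" is an assumption)
datetimes <- c("started_at", "ended_at")
trips[datetimes] <- lapply(
  trips[datetimes],
  function(x) as.POSIXct(x, format = "%d/%m/%Y %H:%M", tz = "UTC")
)

# End minus start, in minutes, stored as a plain numeric column
trips$ride_length <- as.numeric(
  difftime(trips$ended_at, trips$started_at, units = "mins")
)

trips$ride_length
# [1] 10 15
```

Wrapping difftime in as.numeric drops the difftime class, which may be convenient if you later want to summarise or plot ride_length as an ordinary number.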