Home > Mobile >  How to convert a column of UTC timestamps into several different timezones?
How to convert a column of UTC timestamps into several different timezones?

Time:08-05

I have a dataset with dates stored in the DB as UTC, however, the timezone is actually different.

mydat <- data.frame(
  time_stamp=c("2022-08-01 05:00:00 UTC","2022-08-01 17:00:00 UTC","2022-08-02 22:30:00 UTC","2022-08-04 05:00:00 UTC","2022-08-05 02:00:00 UTC"),
  timezone=c("America/Chicago","America/New_York","America/Los_Angeles","America/Denver","America/New_York")
)

I want to apply the timezone to the UTC saved timestamps, over the entire column.

I looked into the with_tz function in the lubridate package, but I don't see how to reference the "timezone" column, rather than hardcoding in a value.

Such as if I try

with_tz(mydat$time_stamp, tzone = mydat$timezone)

I get the following error

Error in as.POSIXlt.POSIXct(x, tz) : invalid 'tz' value`

However, if I try

mydat$time_stamp2 <- with_tz(mydat$time_stamp,"America/New_York")

that will render a new column without issue. How can I do this just referencing column values?

CodePudding user response:

Welcome to StackOverflow. This is nice, common, and tricky problem! The following should do what you ask for:

Code

mydat <- data.frame(time_stamp=c("2022-08-01 05:00:00 UTC", 
                                 "2022-08-01 17:00:00 UTC",
                                 "2022-08-02 22:30:00 UTC",
                                 "2022-08-04 05:00:00 UTC", 
                                 "2022-08-05 02:00:00 UTC"),
                    timezone=c("America/Chicago", "America/New_York", 
                               "America/Los_Angeles", "America/Denver",
                               "America/New_York"))

mydat$utc <- anytime::utctime(mydat$time_stamp, tz="UTC")
mydat$format <- ""
for (i in seq_len(nrow(mydat)))
    mydat[i, "format"] <- strftime(mydat[i,"utc"], 
                                   "%Y-%m-%d %H:%M:%S",
                                   tz=mydat[i,"timezone"])

Output

> mydat
               time_stamp            timezone                 utc              format
1 2022-08-01 05:00:00 UTC     America/Chicago 2022-08-01 05:00:00 2022-08-01 00:00:00
2 2022-08-01 17:00:00 UTC    America/New_York 2022-08-01 17:00:00 2022-08-01 13:00:00
3 2022-08-02 22:30:00 UTC America/Los_Angeles 2022-08-02 22:30:00 2022-08-02 15:30:00
4 2022-08-04 05:00:00 UTC      America/Denver 2022-08-04 05:00:00 2022-08-03 23:00:00
5 2022-08-05 02:00:00 UTC    America/New_York 2022-08-05 02:00:00 2022-08-04 22:00:00
> 

Comment

We first parse your data as UTC, I once wrote a helper function for that in my anytime package (there are alternatives but this is how I do it...). We then need to format from the given (numeric !!) UTC representation to the give timezone. We need a loop for this as the tz argument to strftime() is not vectorized.

  • Related