Parse string with a 5-digit year to POSIXct

Time:08-03

I want to parse a character string with a year of 5 or more digits, e.g. 10000-01-01 01:00:00, to class POSIXct.

A 4-digit year is of course no problem. For example,

as.POSIXct("1000-01-01  01:00:00")

But with a 5-digit year, e.g. 10000, as.POSIXct errors:

as.POSIXct("10000-01-01  01:00:00") 
# Error in as.POSIXlt.character(x, tz, ...) : 
#      character string is not in a standard unambiguous format

Is there a way to handle years with more than 4 digits in R?

CodePudding user response:

R itself has no problem handling years with 5 digits. The problem is the character parsing step: as.POSIXct hands character input to as.POSIXlt.character (the function named in the error message), which uses strptime, and strptime only accepts years 0-9999 on input.
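As a quick check of that claim, the sketch below calls strptime directly. The format string is an assumption chosen to match the question's input, and the variable names are only illustrative.

```r
# Explicit format matching the strings in the question (an assumption).
fmt <- "%Y-%m-%d %H:%M:%S"

# A 4-digit year parses normally.
four_digit <- strptime("1000-01-01 01:00:00", fmt, tz = "UTC")

# A 5-digit year does not match %Y, so strptime returns NA
# instead of a date-time.
five_digit <- strptime("10000-01-01 01:00:00", fmt, tz = "UTC")

is.na(four_digit)
is.na(five_digit)
```

as.POSIXct raises the "not in a standard unambiguous format" error when none of the formats it tries produces a non-NA result.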

The following code produces a POSIXct object of the correct date/time:

structure(253402304400, class = c("POSIXct", "POSIXt"))
#> [1] "10000-01-01 01:00:00 GMT"
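The number 253402304400 is the count of seconds since 1970-01-01 00:00:00 UTC. As a rough sketch of where it comes from, it can be rebuilt from the last date that still has a 4-digit year. The variable names are illustrative, and the time zone is fixed to UTC here as an assumption so the result does not depend on the local setting.

```r
# Days from 1970-01-01 to 9999-12-31, plus one day to reach 10000-01-01.
days <- as.numeric(as.Date("9999-12-31")) + 1

# Convert days to seconds and add one hour for 01:00:00.
secs <- days * 86400 + 3600

# Wrap the count of seconds as POSIXct, with an explicit UTC time zone.
big_time <- structure(secs, class = c("POSIXct", "POSIXt"), tzone = "UTC")

secs == 253402304400
```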

If you construct the date-time as POSIXlt, you can assign the year component numerically and then convert to POSIXct. That makes it possible to define the following function, which behaves like as.POSIXct but can handle large years:

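as.bigPOSIX <- function(x) {
  # Swap the leading year digits for a parseable placeholder year
  y <- as.POSIXlt(sub("^\\d+", "2000", x))
  # Overwrite the year with the real one (POSIXlt stores years since 1900)
  y$year <- sapply(strsplit(x, "-"), function(a) as.numeric(a[1])) - 1900
  as.POSIXct(y)
}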

For example:

as.bigPOSIX(c("10000-01-01 01:00:00", "23456-03-09 12:04:01", 
               "2022-07-05 23:59:59"))
#> [1] "10000-01-01 01:00:00 GMT" "23456-03-09 12:04:01 GMT" 
#> [3] "2022-07-05 23:59:59 GMT" 
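The GMT label in that output comes from the session's time zone. If you want the result to be the same on every machine, one possible variation is to pass the time zone explicitly. This is only a sketch: the name as_big_posix_tz and its tz argument are hypothetical and not part of base R, and it uses the same placeholder-year idea as above.

```r
# Hypothetical variant of the placeholder-year approach with an explicit tz.
as_big_posix_tz <- function(x, tz = "UTC") {
  # Parse with a placeholder year that strptime accepts.
  y <- as.POSIXlt(sub("^\\d+", "2000", x), tz = tz)
  # Take everything before the first "-" as the real year;
  # POSIXlt stores the year as an offset from 1900.
  y$year <- as.numeric(sub("-.*$", "", x)) - 1900
  as.POSIXct(y)
}

res <- as_big_posix_tz(c("10000-01-01 01:00:00", "2022-07-05 23:59:59"))
as.numeric(res[1]) == 253402304400
```

The comparison on the last line checks the first element against the number of seconds used in the structure() call above.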