I want to parse a character string with a year of five or more digits, e.g. 10000-01-01 01:00:00, to class POSIXct.
A 4-digit year is of course no problem. For example,
as.POSIXct("1000-01-01 01:00:00")
But with a 5-digit year, e.g. 10000, as.POSIXct errors:
as.POSIXct("10000-01-01 01:00:00")
# Error in as.POSIXlt.character(x, tz, ...) :
# character string is not in a standard unambiguous format
Is there a way to handle years with more than 4 digits in R?
CodePudding user response:
There is no problem handling years with 5 digits as such; the problem here is the as.POSIXct.character function, since it uses strptime, which can only parse years 0-9999.
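As a quick check of that limit, the sketch below calls strptime directly with an explicit format (the format string and the UTC time zone are my own choices for illustration). A 4-digit year should parse, while the 5-digit one should come back as NA instead of a date:

```r
# Explicit format, so no guessing of "standard unambiguous" formats is involved
fmt <- "%Y-%m-%d %H:%M:%S"

ok  <- strptime("9999-01-01 01:00:00", fmt, tz = "UTC")   # largest parseable year
bad <- strptime("10000-01-01 01:00:00", fmt, tz = "UTC")  # one digit too many

is.na(ok)   # did the 4-digit year fail to parse?
is.na(bad)  # did the 5-digit year fail to parse?
```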
The following code produces a POSIXct object of the correct date/time:
structure(253402304400, class = c("POSIXct", "POSIXt"))
#> [1] "10000-01-01 01:00:00 GMT"
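The number passed to structure() is the count of seconds since 1970-01-01 00:00:00 UTC. One way to arrive at it, sketched here with variable names of my own, is to take the last date that can still be parsed, step one day forward numerically, and convert to seconds:

```r
# Days since 1970-01-01 for the last date that still parses
days_9999 <- as.numeric(as.Date("9999-12-31"))

# 10000-01-01 is one day later; add one hour for 01:00:00
secs <- (days_9999 + 1) * 86400 + 3600

# Compare with the constant used above
secs == 253402304400

# Setting tzone makes the printed value independent of the local time zone
structure(secs, class = c("POSIXct", "POSIXt"), tzone = "UTC")
```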
If you use POSIXlt to construct the date-times, you can assign the year part numerically and then convert to POSIXct. That allows the following function to be defined; it does the same as as.POSIXct but can handle large years:
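as.bigPOSIX <- function(x) {
  # Parse with a placeholder year (2000) so that strptime accepts the string
  y <- as.POSIXlt(sub("^\\d+", "2000", x))
  # Overwrite the year field, which POSIXlt stores as years since 1900
  y$year <- sapply(strsplit(x, "-"), function(a) as.numeric(a[1])) - 1900
  as.POSIXct(y)
}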
For example:
as.bigPOSIX(c("10000-01-01 01:00:00", "23456-03-09 12:04:01",
              "2022-07-05 23:59:59"))
#> [1] "10000-01-01 01:00:00 GMT" "23456-03-09 12:04:01 GMT"
#> [3] "2022-07-05 23:59:59 GMT"
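The "GMT" in that output presumably reflects the session's time zone, since as.POSIXlt parses in the local zone by default. A possible variant, sketched under the assumption that you want to fix the zone explicitly (as_big_posix and its tz argument are hypothetical names, not part of base R), passes tz through to as.POSIXlt:

```r
# Hypothetical variant of the function above with an explicit time zone
as_big_posix <- function(x, tz = "UTC") {
  # Parse with a placeholder year (2000) in the requested zone
  y <- as.POSIXlt(sub("^\\d+", "2000", x), tz = tz)
  # Replace the year field (years since 1900) with the real year
  y$year <- vapply(strsplit(x, "-"), function(a) as.numeric(a[1]), numeric(1)) - 1900
  # as.POSIXct picks up the zone stored on the POSIXlt object
  as.POSIXct(y)
}

res <- as_big_posix(c("10000-01-01 01:00:00", "2022-07-05 23:59:59"))
as.numeric(res)  # seconds since the epoch
```

With UTC there are no daylight-saving rules to consider, which keeps the placeholder-year trick simple; for other zones the result is worth checking against known values.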