Use `dput` for POSIXct dates in r: why are they in a format like, e.g., 1070236800 instead of 2003-1

Time:11-09

I have the following array of POSIXct dates

> x
[1] "2003-12-01 UTC" "2003-12-02 UTC" "2003-12-03 UTC" "2003-12-04 UTC" "2003-12-05 UTC" "2003-12-08 UTC"
[7] "2003-12-09 UTC" "2003-12-10 UTC" "2003-12-11 UTC" "2003-12-12 UTC"

whose structure is:

str(x)
 POSIXct[1:10], format: "2003-12-01" "2003-12-02" "2003-12-03" "2003-12-04" "2003-12-05" "2003-12-08" "2003-12-09 ..."

Anyway, when I use dput, I obtain:

structure(c(1070236800, 1070323200, 1070409600, 1070496000, 1070582400, 
1070841600, 1070928000, 1071014400, 1071100800, 1071187200), class = c("POSIXct", 
"POSIXt"), tzone = "UTC")

CodePudding user response:

POSIXct is stored as a numeric value representing the number of seconds since midnight on 1st January 1970 UTC. Note that if we write the same structure manually with the numeric value set to 0, we get:

structure(0, class = c("POSIXct", "POSIXt"), tzone = "UTC")
#> [1] "1970-01-01 UTC"
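
The same check can be run in the other direction for the first date in the question. This is a minimal sketch; the variable names `first`, `secs` and `days` are only illustrative.

```r
# First date from the question, built explicitly in UTC
first <- as.POSIXct("2003-12-01", tz = "UTC")

# as.numeric() drops the class and attributes, leaving the stored value
secs <- as.numeric(first)
secs
#> [1] 1070236800

# The same value expressed as whole days since 1970-01-01
days <- secs / 86400
days
#> [1] 12387
```

So the 1070236800 printed by dput is simply 12387 days' worth of seconds counted from the epoch.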

We can confirm that a POSIXct object is stored as a double-precision floating-point number:

x <- Sys.time()

x
#> [1] "2022-11-08 11:33:36 GMT"

class(x)
#> [1] "POSIXct" "POSIXt" 

typeof(x)
#> [1] "double"

It is stored as a number so that we can do arithmetic on date-times. Subtracting a number from a POSIXct object subtracts that many seconds:

x - 3600
#> [1] "2022-11-08 10:33:36 GMT"

If it were stored as a character string, then any time we wanted to perform calculations on date-times or plot them, we would have to parse the character strings into numerical values, do the calculations, then rewrite the character strings. This is much less efficient than having an underlying numerical representation with a special print method that displays the number as a date-time.
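
As a rough illustration of that arithmetic, here is a short sketch using two of the dates from the question (5 and 8 December 2003). The names `gap`, `gap_secs` and `next_day` are made up for the example.

```r
# Two consecutive elements of the question's vector, either side of a weekend
x <- as.POSIXct(c("2003-12-05", "2003-12-08"), tz = "UTC")

# diff() subtracts the underlying numbers and returns a difftime object
gap <- diff(x)

# Ask for the gap in seconds explicitly
gap_secs <- as.numeric(gap, units = "secs")
gap_secs
#> [1] 259200

# Adding 86400 seconds moves the date-time forward by one day
next_day <- x[1] + 86400
format(next_day, "%Y-%m-%d")
#> [1] "2003-12-06"
```

None of this needs any string parsing once the values are POSIXct; the conversion to text happens only when the result is printed or formatted.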

CodePudding user response:

POSIXct stores a date-time as a number, with the time zone kept as an attribute. The number you are seeing, 1070236800, is the number of seconds since 1 January 1970 (UTC). For a date before that, the number is negative. For example:

date <- as.POSIXct("1969-12-31",tz="UTC",format="%Y-%m-%d")
dput(date)

Gives

structure(-86400, class = c("POSIXct", "POSIXt"), tzone = "UTC")

Since 1969 is before 1970, the number of seconds is negative. The value is 86400 because I picked the day before 1 January 1970, and there are 86400 seconds in a day.

So if I take the number of seconds from your first element and convert it back, I get the date you started with:

as.POSIXct(1070236800, origin = "1970-01-01", tz = "UTC")    

Yields

[1] "2003-12-01 UTC"

Storing it this way speeds up computation, processing and conversion to other formats.
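
If the goal is a dump that a person can read, one possible workaround (a sketch, not the only way to do it) is to call dput on the formatted strings and convert them back afterwards. The names `txt`, `y` and `same` are hypothetical, and the format string assumes the values are whole days, as in the question; any time of day would be lost with it.

```r
x <- as.POSIXct(c("2003-12-01", "2003-12-02"), tz = "UTC")

# Format to text first, so that dput shows dates instead of seconds
txt <- format(x, "%Y-%m-%d")
dput(txt)
#> c("2003-12-01", "2003-12-02")

# Convert back; the strings carry no time zone, so it has to be supplied again
y <- as.POSIXct(txt, tz = "UTC")

# The round trip recovers the same underlying numbers
same <- all(as.numeric(x) == as.numeric(y))
same
#> [1] TRUE
```

The numeric dput output is still the safer choice for sharing data, since it reproduces the object exactly, including the class and tzone attributes.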
