What does calling as.numeric() do to a lubridate Date object?

Time:08-06

I am working with an external package that converts data frame columns of class Date (created with lubridate) to numeric type. (I confirmed this by running as.numeric() on the columns.)

I'm wondering if there's a way to convert it back?

For example, if I have the date "01-01-2021", then running as.numeric() on it returns -719143. How can I turn that back into "01-01-2021"?

CodePudding user response:

Note that the Date class is part of base R, not lubridate.

The data was probably parsed as year/month/day by mistake. Using base R, to rule out lubridate as the cause, we can reproduce the question's result like this:

as.numeric(as.Date("01-01-2021", "%Y-%m-%d"))
## [1] -719143

Had we used day/month/year we would have gotten:

as.numeric(as.Date("01-01-2021", "%d-%m-%Y"))
## [1] 18628

or, using lubridate:

library(lubridate)
as.numeric(dmy("01-01-2021"))
## [1] 18628
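
To check whether a value like the one in the question comes from a mis-parsed string, one option is to convert the number back to a Date and look at its components. This is a minimal base R sketch; it assumes the number is a day count relative to 1970-01-01, and the variable names are illustrative.

```r
# Value reported in the question, assumed to be days since 1970-01-01
n <- -719143

# Turn the day count back into a Date
d <- as.Date(n, origin = "1970-01-01")

# Read the components numerically (printing of very early years can vary by platform)
parts <- as.POSIXlt(d)
yr  <- parts$year + 1900   # POSIXlt stores years since 1900
mon <- parts$mon + 1       # POSIXlt months are 0-based
day <- parts$mday

c(year = yr, month = mon, day = day)
```

A year far from the expected 2021 points to a format mismatch when the string was parsed, rather than to as.numeric() itself.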

CodePudding user response:

EDIT #2: It looks like the original data is being parsed as Jan 20, year 1, which might happen if the year-month-day columns were jumbled while being parsed:

as.numeric(as.Date("01-01-2021", format = "%Y-%m-%d", origin = "1970-01-01")) 
[1] -719143
as.numeric(as.Date("0001-01-20", origin = "1970-01-01")) 
[1] -719143
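
One consequence of this parse is that the conversion is lossy: with "%Y-%m-%d", only the first two digits of the last field are read as the day and the rest of the string is ignored, so the real year never reaches the stored number. A small sketch of that, where the second input string is a made-up value for comparison:

```r
fmt <- "%Y-%m-%d"

a <- as.Date("01-01-2021", fmt)  # the string from the question
b <- as.Date("01-01-2099", fmt)  # hypothetical string with a different year

# Both collapse to the same day count, so the original year is not recoverable
as.numeric(a) == as.numeric(b)
```

In that situation the numbers alone cannot be turned back into the original dates; the original strings would need to be parsed again with the correct format.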

Is there a way to share an example of the raw data as you have it? e.g. dput(MY_DATA[1:10, DATE_COL])


EDIT: -719143 days is about 1970 years, which can't be a coincidence, given that many date/time formats use 1970 as a baseline. I wonder if 01-01-2021 is being evaluated as the arithmetic expression 1 - 1 - 2021 = -2021, so that we're looking at perhaps 2021 seconds/days/[?] before year zero, which would be about 1970 years before the epoch...

-719143/(365)
[1] -1970.255

For instance, we can get something close with:

as.numeric(as.Date("0000-01-01", origin = "1970-01-01"))
[1] -719528

R treats a string describing a date as text:

x <- "01-01-2021"
class(x)
[1] "character"

We can convert it to a Date data type using these two equivalent commands:

base_dt <- as.Date(x, "%m-%d-%Y")   # base R version
lubridt <- lubridate::mdy(x)        # convenience lubridate function

identical(base_dt, lubridt)
[1] TRUE

Under the hood, a Date object in R is a numeric value with a flag telling R it's a date:

> typeof(lubridt)                   # What general type of data is it?
[1] "double"                        # --> numeric, stored as a double
> as.numeric(lubridt)
[1] 18628
> class(lubridt)                    # Does it have any special class attributes?
[1] "Date"                          # --> yes, it's a Date
> dput(lubridt)                     # How would we construct it from scratch?
structure(18628, class = "Date")    # --> by giving 18628 a Date class attribute
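
The same relationship can be checked by removing and re-attaching the class by hand. A short base R sketch, with illustrative variable names:

```r
d <- as.Date("2021-01-01")

# Dropping the class attribute exposes the underlying day count
n <- unclass(d)

# Re-attaching the class turns the bare number back into a Date
rebuilt <- structure(n, class = "Date")

identical(rebuilt, d)
```

This is only meant to show what the class attribute does; in everyday code as.Date() with an explicit origin is the more readable way to go from a number to a Date.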

In R, a Date is encoded as the number of days since 1970 began:

> as.Date("1970-01-01") + as.numeric(lubridt)
[1] "2021-01-01"

We could convert it back to the original text using:

format(base_dt, "%m-%d-%Y")
[1] "01-01-2021"
identical(x, format(base_dt, "%m-%d-%Y"))
[1] TRUE

CodePudding user response:

Just use lubridate's as_date():

x <- lubridate::dmy("01-01-2021")
lubridate::as_date(as.numeric(x))
[1] "2021-01-01"

The value after conversion is the number of days since lubridate::origin:

lubridate::origin + lubridate::days(as.numeric(x))
[1] "2021-01-01 UTC"

CodePudding user response:

as.numeric() does this to a Date object:

  1. It converts the Date object to the number of days since 1/1/1970.
  2. With as.Date(y, origin = "1970-01-01") we can turn it back into a Date.

Here is the example:

# create date object
library(lubridate)
x <- dmy("01-01-2021")
x
[1] "2021-01-01"

# convert date object to number of days since 1/1/1970
y <- as.numeric(x)
y
[1] 18628

# Back to Date, counting days from the origin
as.Date(y, origin = "1970-01-01")
[1] "2021-01-01"
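
Since the question is about data frame columns, here is a sketch of restoring several columns at once. The data frame, its column names, and the date_cols vector are invented for illustration, and the approach assumes the columns held correctly parsed dates before they were turned into numbers.

```r
# Hypothetical data frame whose Date columns were converted to numeric
df <- data.frame(
  id     = 1:3,
  start  = as.numeric(as.Date(c("2021-01-01", "2021-02-15", "2021-03-31"))),
  finish = as.numeric(as.Date(c("2021-01-10", "2021-02-20", "2021-04-05")))
)

# Names of the columns to restore (an assumption about your data)
date_cols <- c("start", "finish")

# Convert each listed column back to Date, counting days from 1970-01-01
df[date_cols] <- lapply(df[date_cols], as.Date, origin = "1970-01-01")

str(df)
```

Passing origin explicitly keeps the call working on R versions where as.Date() has no default origin for numeric input.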