I am working with an external package that's converting columns of a dataframe with the lubridate date type Date into numeric type. (Confirmed by running as.numeric() on the columns).
I'm wondering if there's a way to convert it back?
For example, if I have the date "O1-01-2021" then running as.numeric on it returns -719143. How can I turn that back into "O1-01-2021" ?
CodePudding user response:
Note that Date class is part of base R, not lubridate.
You probably assumed that the data was year/month/day by mistake. Using base R to eliminate lubridate as a problem we can replicate the question's result like this:
as.numeric(as.Date("01-01-2021", "%Y-%m-%d"))
## [1] -719143
Had we used day/month/year we would have gotten:
as.numeric(as.Date("01-01-2021", "%d-%m-%Y"))
## [1] 18628
or using lubridate
library(lubridate)
as.numeric(dmy("01-01-2021"))
## [1] 18628
CodePudding user response:
EDIT #2: It looks like the original data is being parsed as Jan 20, year 1, which might happen if the year-month-day columns were jumbled while being parsed:
as.numeric(as.Date("01-01-2021", format = "%Y-%m-%d", origin = "1970-01-01"))
[1] -719143
as.numeric(as.Date("0001-01-20", origin = "1970-01-01"))
[1] -719143
Is there a way to share an example of the raw data as you have it? e.g. dput(MY_DATA[1:10, DATE_COL])
EDIT: -719143 is about 1970 years of days, which can't be a coincidence, given that many date/time formats use 1970 as a baseline. I wonder if 01-01-2021
is being interpreted as the numeric formula equal to -2021
and so we're looking at perhaps -2021
seconds/days/[?] before year zero, which would be about -1970 years before the epoch...
-719143/(365)
[1] -1970.255
For instance, we can get something close with:
as.numeric(as.Date("0000-01-01", origin = "1970-01-01"))
[1] -719528
R treats a string describing a date as text:
x <- "01-01-2021"
class(x)
[1] "character"
We can convert it to a Date
data type using these two equivalent commands:
base_dt <- as.Date(x, "%m-%d-%Y") # base R version
lubridt <- lubridate::mdy(x) # convenience lubridate function
identical(base_dt, lubridt)
[1] TRUE
Under the hood, a Date
object in R is a numeric value with a flag telling R it's a date:
> typeof(lubridt) # What general type of data is it?
[1] "double" # --> numeric, stored as a double
> as.numeric(lubridt)
[1] 18628
> class(lubridt) # Does it have any special class attributes?
[1] "Date" # --> yes, it's a Date
> dput(lubridt) # How would we construct it from scratch?
structure(18628, class = "Date") # --> by giving 18628 a Date attribute
In R, a Date is encoded as the number of days since 1970 began:
> as.Date("1970-01-1") as.numeric(lubridt)
[1] "2021-01-01"
We could convert it back to the original text using:
format(base_dt, "%m-%d-%Y")
[1] "01-01-2021"
identical(x, format(base_dt, "%m-%d-%Y"))
[1] TRUE
CodePudding user response:
Just use as_date
x <- lubridate::dmy("01-01-2021")
as_date(as.numeric(x))
"2021-01-01"
The value after conversion is the number of days since lubridate::origin
lubridate::origin days(as.numeric(x))
[1] "2021-01-01 UTC"
CodePudding user response:
as.numeric
does this to a date object ->
- converts date object to number of days from 1/1/1970
- With
as.Date(y, "1970-01-01")
we could turn it back to date
Here is the example:
# create date object
library(lubridate)
x <- dmy("01-01-2021")
x
[1] "2021-01-01"
#convert date object to number of days 1/1/1970
y <- as.numeric(x)
y
[1] 18628
# Back to Date from the origin
as.Date(y, "1970-01-01")
[1] "2021-01-01"