R - Problem Storing lubridate dates in a dataframe

Time:12-30

When I try to store a date in a data frame, it is represented as a number instead of in a date format. Here is my example.

library(lubridate)

df1 = data.frame(task = c('do somthing', 'do something else', 'do something more')
                 ,Date_Start = c(NA, NA, NA))

Start_date = ymd("2023-12-01")
df1$Date_Start[1] = Start_date

Start_date     #this returns a "2023-12-01"
df1$Date_Start[1]  #this returns a 19692

I want df1$Date_Start[1] to store "2023-12-01"

CodePudding user response:

This has nothing to do with lubridate, since you will get the same result here with as.Date as you do with lubridate::ymd.

When you create a column of NA values that you plan to write dates into, you need to tell R that the column will hold dates by filling it with as.Date(NA). A plain NA is logical, so assigning a Date into that column drops the Date class and leaves only the underlying number of days since 1970-01-01.

library(lubridate)

df1 = data.frame(task = c('do somthing', 'do something else', 'do something more')
                 ,Date_Start = c(NA, NA, NA))

Start_date = ymd("2023-12-01")
df1$Date_Start <- as.Date(NA)
df1$Date_Start[1] <- Start_date[1]

Start_date 
#> [1] "2023-12-01"
df1$Date_Start[1] 
#> [1] "2023-12-01"

df1
#>                task Date_Start
#> 1       do somthing 2023-12-01
#> 2 do something else       <NA>
#> 3 do something more       <NA>

Created on 2022-12-29 with reprex v2.0.2
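
The class change can be seen on a bare vector, without a data frame. This is only a minimal sketch: it uses base as.Date in place of ymd (an assumption made to avoid the package dependency), and the names plain_na and date_na are illustrative.

```r
# A bare NA is logical, so c(NA, NA, NA) is a logical vector.
plain_na <- c(NA, NA, NA)
class(plain_na)
#> [1] "logical"

# Assigning a Date into it keeps only the underlying number
# (days since 1970-01-01); the Date class is not carried over.
plain_na[1] <- as.Date("2023-12-01")
class(plain_na)
#> [1] "numeric"

# Starting from as.Date(NA) gives the vector the Date class up front.
date_na <- rep(as.Date(NA), 3)
date_na[1] <- as.Date("2023-12-01")
class(date_na)
#> [1] "Date"
```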

CodePudding user response:

The Start_date column was created as a logical vector:

str(df1)
'data.frame':   3 obs. of  2 variables:
 $ task      : chr  "do somthing" "do something else" "do something more"
 $ Start_date: logi  NA NA NA

After you assign a date to the first element, the column is converted to numeric:

str(df1)
'data.frame':   3 obs. of  2 variables:
 $ task      : chr  "do somthing" "do something else" "do something more"
 $ Start_date: num  19692 NA NA

You can avoid this by creating your data.frame with Start_date defined as a Date column:

df1 = data.frame(task = c('do somthing', 'do something else', 'do something more'), Start_date = as.Date(NA))

str(df1)
'data.frame':   3 obs. of  2 variables:
 $ task      : chr  "do somthing" "do something else" "do something more"
 $ Start_date: Date, format: NA NA NA

Now you can define your date at the first element:

df1$Start_date[1] = Start_date
df1
               task Start_date
1       do somthing 2023-12-01
2 do something else       <NA>
3 do something more       <NA>
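
If the column has already been converted to numeric, it may be possible to recover the dates instead of rebuilding the data frame. The sketch below assumes the numbers are days since 1970-01-01, as in the output above; df2 is a hypothetical data frame used only for illustration.

```r
# Hypothetical data frame whose date column was already coerced to numeric.
df2 <- data.frame(task = c("a", "b", "c"),
                  Start_date = c(19692, NA, NA))

# Reinterpret the numbers as days since the Unix epoch.
df2$Start_date <- as.Date(df2$Start_date, origin = "1970-01-01")

class(df2$Start_date)
#> [1] "Date"
format(df2$Start_date[1])
#> [1] "2023-12-01"
```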

CodePudding user response:

You can also store the date as a formatted string. Note that the column then holds character values, not Date objects:

library(lubridate)

df1 = data.frame(task = c('do somthing', 'do something else', 'do something more')
             ,Start_date = c(NA, NA, NA))

df1$Start_date[1] <- format(ymd("2023-12-01"), "%Y-%m-%d")
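
A short check of what this approach stores, sketched with base as.Date instead of ymd (an assumption, to keep the example free of package dependencies) and a hypothetical data frame df3:

```r
df3 <- data.frame(task = c("a", "b", "c"),
                  Start_date = c(NA, NA, NA))

# format() returns a string, so the whole column becomes character.
df3$Start_date[1] <- format(as.Date("2023-12-01"), "%Y-%m-%d")
class(df3$Start_date)
#> [1] "character"

# Date arithmetic needs a conversion back to Date first.
as.Date(df3$Start_date[1]) + 1
#> [1] "2023-12-02"
```

This displays as wanted, but sorting, differences, and other date operations would need the as.Date conversion each time.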