Home > Blockchain >  Converting differing date formats to Date class in R
Converting differing date formats to Date class in R

Time:05-03

I have a CSV dataset that contains dates written in one of two forms, either with or without a timestamp, eg: 21/4/1999 09:52:04 or just 16/1/2003 - these are all in the MM/DD/YYYY format regardless of whether there is a timestamp afterwards.

I am importing this data and need to make Boolean queries on the dates. However, importing it as either Date or DateTime renders one or the other formats as NA. For example:

library(readr)
Audit_Review <- read_csv("Audit_Review.txt", 
    col_types = cols(DigitisationDate = col_date(format = "%d/%m/%Y")))

This will import dates without a timestamp ok, but will give NAs for those with a timestamp. However, if I run the following to import it:

library(readr) 
Audit_Review <- read_csv("Audit_Review.txt",   
    col_types = cols(DigitisationDate = col_date(format = "%d/%m/%Y %H:%M:%S")))

Then the observations without a timestamp are rendered as NAs.

Given that I do not need the timestamps (just the dates will do; I don't need to be any more granular than a specific day) is there a way I can import the data or alter it once imported so they are all in the YYYY-MM-DD format?

CodePudding user response:

Import as it is and then mutate to the same date format.

# A tibble: 2 x 1
  time              
  <chr>             
1 21/4/1999 09:52:04
2 16/1/2003 

df %>%
  mutate(time = time %>% 
           as.Date(format = "%d/%m/%Y"))

# A tibble: 2 x 1
  time      
  <date>    
1 1999-04-21
2 2003-01-16

CodePudding user response:

1) Create test data with both dates and date/times, read it in and then use as.Date.

Lines <- "date\n21/4/1999 09:52:04\n16/1/2003"

DF <- read.csv(text = Lines)
DF$date <- as.Date(DF$date, "%d/%m/%Y")

DF
##         date
## 1 1999-04-21
## 2 2003-01-16

2) Alternately define a custom class.

setClass("justDate")
setAs("character", "justDate", function(from) as.Date(from, "%d/%m/%Y"))

read.csv(text = Lines, colClasses = c(date = "justDate"))
##         date
## 1 1999-04-21
## 2 2003-01-16
  • Related