Home > database >  Replacing character dates with Date values
Replacing character dates with Date values

Time:09-29

I have some data where the final column has dates (extracted from file name where they are stored originally), and for all intents and purposes looks like this:

Person <- c("Person A", "Person B", "Person C", "Person A", "Person B", "Person C")
School <- c("School A", "School B", "School C", "School D", "School E", "School F" )
Date <- c("Aug 2019", "Oct 2019","Sep 2019","Nov 2019","Dec 2019","Jul 2019")
df <- data.frame(Person, School, Date)

The dates come from the file name, I added the file name as a column and then use the following to extract the date value as it appears in the example above.

mutate(Date = substr(File, 44, 51))

From this point I need to convert it from a character "Aug 2019" to a date of any format, so I can create some timeseries, I'll only use month/day on the X axis anyway and there are no days specified in the data, it's all month/year.

I can see that R recognises it as a date when I parse it using

parse_date_time(data3$Date, orders = c("bY"))

and it gives the result

> parse_date_time(df$Date, orders = c("bY"))
[1] "2019-08-01 UTC" "2019-10-01 UTC" "2019-09-01 UTC" "2019-11-01 UTC" "2019-12-01 UTC" "2019-07-01 UTC"

I am unfortunately drawing a blank at getting these into the table in R, the furthest I got was the following, the mutate line being what I use to extract the date from the file name.

df2 <- head(df, 4) %>% 
  mutate(Date = substr(Date, 10, 18)) %>% 
  replace(df2$Date,  parse_date_time(df2$Date, orders = c("bY")))

However it is creating new columns instead of replacing the date in the right column. I also tried using mutate instead of replace but that didn't appear to do anything.

enter image description here

Grateful for any advice on this.

CodePudding user response:

The replace is not needed i.e. after the substr, just wrap the parse_date_time within mutate to convert the column

library(dplyr)
library(lubridate)
df2 <- df %>%
   mutate(Date = parse_date_time(substr(Date, 10, 18), orders = c("bY")))
  • Related