Good morning, I can't quite grasp what I am doing wrong here; could someone assist? I am trying to convert a column of dates to a date type in R, but some of the months are abbreviated as "Jan." or "Aug.", so I get NA values when I try to convert them.
My plan is to split each date by delimiter, grab the month, rename it to the full month name, put the date back together, and convert from there. I can't figure out the loop, though, and keep getting "July".
My data frame is now split into three columns, so that July 14 2022 is stored as df$left, df$middle and df$right:
for (month in df$left) {
  if (df$left == "July") {
    df$month <- "July"
  } else if (df$left == "Aug.") {
    df$month <- "August"
  }
  if (df$left == "Sept.") {
    df$month <- "September"
  }
  if (df$left == "Oct.") {
    df$month <- "October"
  }
  if (df$left == "Nov.") {
    df$month <- "November"
  }
  if (df$left == "Dec.") {
    df$month <- "December"
  }
  if (df$left == "Jan.") {
    df$month <- "January"
  }
  if (df$left == "Feb.") {
    df$month <- "February"
  }
  if (df$left == "March") {
    df$month <- "March"
  }
  if (df$left == "April") {
    df$month <- "April"
  }
  if (df$left == "May") {
    df$month <- "May"
  }
  if (df$left == "June") {
    df$month <- "June"
  }
  if (df$left == "July") {
    df$month <- "July"
  }
  if (df$left == "Aug.") {
    df$month <- "August"
  }
}
  left  middle right Workout.Date  Activity.Type
  <chr> <chr>  <chr> <chr>         <chr>
1 July  14,    2022  July 14, 2022 Run
CodePudding user response:
The lubridate package is fairly clever at working out how to interpret a date. I'm using the tidyverse only because it makes it easy to format the data and show the column type.
First, create some test data:
library(lubridate)
library(tidyverse)
d <- tibble(Workout.Date=c("July 14, 2022", "Jul. 14, 2022",
"September 1, 2021", "Sept. 1, 2021"))
Now, a one-line solution:
d %>% mutate(Workout.Date=mdy(Workout.Date))
  Workout.Date
  <date>
1 2022-07-14
2 2022-07-14
3 2021-09-01
4 2021-09-01
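If you would rather keep the split-and-rename approach from the question, the loop is not needed. The comparison `df$left == "July"` tests the whole column at once and `df$month <- "July"` overwrites every row, which is likely why everything comes out as "July". Below is a minimal sketch of a vectorized alternative that uses a named character vector as a lookup table. The name `month_lookup` and the sample rows are made up for illustration, and the `%B` format assumes an English locale.

```r
# Hypothetical lookup table: names are the spellings seen in the data,
# values are the full month names.
month_lookup <- c(
  "Jan." = "January", "Feb." = "February", "March" = "March",
  "April" = "April", "May" = "May", "June" = "June", "July" = "July",
  "Aug." = "August", "Sept." = "September", "Oct." = "October",
  "Nov." = "November", "Dec." = "December"
)

# Small stand-in for the split data frame from the question (made-up rows)
df <- data.frame(
  left   = c("July", "Aug.", "Sept.", "Jan."),
  middle = c("14,", "2,", "1,", "30,"),
  right  = c("2022", "2022", "2021", "2023"),
  stringsAsFactors = FALSE
)

# Index the lookup by the whole column at once; spellings that are not in
# the table come back as NA instead of being silently overwritten.
df$month <- unname(month_lookup[df$left])

# Put the pieces back together and parse (assumes an English locale for %B)
df$date <- as.Date(paste(df$month, df$middle, df$right), format = "%B %d, %Y")
```

Any row whose `left` value is missing from the table ends up with `NA` in `df$month`, which makes unexpected spellings easy to spot.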
CodePudding user response:
You may need to make a couple of passes to convert all of the entries.
First convert the dates that use the full month name, then retry the rows that returned NA with a different format string.
Here is an example using base R:
df <- read.table(header=TRUE, text=" Workout.Date
'Jul. 14, 2022'
'July 15, 2022'
'Jul. 16, 2022'
'Jul. 17, 2022'
'Jul 18, 2022'
'July 19, 2022'
'July 20, 2022'
'July 21, 2022'
'July 22, 2022'")
df$date <- as.Date(df$Workout.Date, format = "%B %e, %Y")
# now try a different format on the rows that are still NA
# abbreviated month followed by a period
df$date[is.na(df$date)] <- as.Date(df$Workout.Date[is.na(df$date)], format = "%b. %e, %Y")
# abbreviated month without a period
df$date[is.na(df$date)] <- as.Date(df$Workout.Date[is.na(df$date)], format = "%b %e, %Y")
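The repeated passes above can also be wrapped in a small helper that tries a list of formats in order and only fills the entries that are still NA. This is just a sketch: `parse_with_formats` is a made-up name, the sample dates are invented, and the formats assume an English locale. As far as I know, base R does not treat "Sept." as a month abbreviation, so the sketch rewrites it to "Sep." first.

```r
# Hypothetical helper: try each format in turn, filling only the NAs
parse_with_formats <- function(x, formats) {
  out <- rep(as.Date(NA), length(x))
  for (fmt in formats) {
    todo <- is.na(out)
    if (!any(todo)) break          # everything parsed, stop early
    out[todo] <- as.Date(x[todo], format = fmt)
  }
  out
}

# Made-up sample input covering the spellings discussed in this thread
dates <- c("Jul. 14, 2022", "July 15, 2022", "Jul 18, 2022", "Sept. 1, 2021")

# Assumption: "Sept." has to be normalised to "Sep." before %b can match it
dates <- sub("^Sept\\.", "Sep.", dates)

parsed <- parse_with_formats(
  dates,
  c("%B %d, %Y",    # full month name
    "%b. %d, %Y",   # abbreviated month followed by a period
    "%b %d, %Y")    # abbreviated month without a period
)
```

Entries that match none of the formats stay NA, so `dates[is.na(parsed)]` lists whatever still needs attention.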