I'm brand new to R and to programming in general. I have a column containing dates. Some are in the "01 January 2020" format, and some have only the month and year (i.e. "January 2020"). I want to mutate them into a new field where "01" is added in front of every date that is in the month-year format, and then use lubridate to parse the result into dates.
This is what I've tried. I'm trying to extract the first character of the Date column. If it is an upper-case letter, I prepend "01" to it. I am using the tidyverse packages, including dplyr.
df %>% mutate(new_date = ifelse(str_sub(Date, start = 1, end = 1)== "[:upper:]"), paste('01', Date, sep = ' '), new_date = Date)
I'm getting the error message "no is missing", but I thought I had included new_date = Date to keep the current formatting.
Thank you for your help!
CodePudding user response:
This can be done in many ways.
base R, using a lookahead and a backreference:
# "\\101 " is backreference \\1 (the empty match at the start) followed by the literal "01 "
sub("(^)(?=[A-Za-z]+)", "\\101 ", date, perl = TRUE)
[1] "01 January 2020" "01 January 2020" "12 February 1999" "01 March 2033"
base R, using only a backreference:
# Capture the leading run of letters and put "01 " in front of it
sub("(^[A-Za-z]+)", "01 \\1", date, perl = TRUE)
dplyr and stringr, using the same logic:
library(dplyr)
library(stringr)
data.frame(date) %>%
  mutate(date = str_replace(date, "(^)(?=[A-Za-z]+)", "\\101 "))
If you do insist on using ifelse:
library(dplyr)
library(stringr)
data.frame(date) %>%
  mutate(date = ifelse(str_detect(date, "^[:upper:]"),
                       sub("^", "01 ", date),
                       date))
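As far as I can tell, the attempt in the question fails for two reasons. The closing parenthesis right after "[:upper:]" ends the ifelse() call early, so it only receives the test argument. And == compares each first character with the literal string "[:upper:]" instead of matching a pattern, so every test is FALSE and ifelse() reaches for the missing no argument. Below is a minimal base R sketch of the repaired logic. The names dates, needs_day and padded are only for illustration, and the sample values are the ones used in this answer.

```r
# Sample input, same values as the date vector used in this answer.
dates <- c("01 January 2020", "January 2020", "12 February 1999", "March 2033")

# TRUE where the string starts with an upper-case letter, i.e. the day is missing.
needs_day <- grepl("^[[:upper:]]", dates)

# test, yes and no all sit inside a single pair of ifelse() parentheses.
padded <- ifelse(needs_day, paste("01", dates), dates)
print(padded)
```

Inside mutate() the same call would go on the right-hand side of new_date = ..., with the Date column in place of dates.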
Data:
date <- c("01 January 2020","January 2020", "12 February 1999", "March 2033")
CodePudding user response:
Here is a non-regex option where we parse the strings into date-times with parse_date and then format them:
library(parsedate)
format(parse_date(date), '%d %B %Y')
[1] "01 January 2020" "01 January 2020" "12 February 1999" "01 March 2033"
data
date <- c("01 January 2020","January 2020", "12 February 1999", "March 2033")
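Whichever approach produces the padded strings, the last step the question asks about is turning them into real dates. Here is a small sketch using base R's as.Date() with an explicit format. It assumes an English locale, since %B matches month names in the current locale, and the vector name padded is only illustrative. lubridate's dmy() should give the same result on these strings.

```r
# Strings that already carry a day, as produced by any of the approaches above.
padded <- c("01 January 2020", "01 January 2020", "12 February 1999", "01 March 2033")

# %d = day of month, %B = full month name (locale-dependent), %Y = four-digit year.
parsed <- as.Date(padded, format = "%d %B %Y")

print(class(parsed))
print(parsed)
```

If any element comes back as NA, the most likely culprits are a non-English locale or a string that does not follow the day-month-year layout.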