Converting irregular dates in R and the Tidyverse

Time:03-09

I have a series of dates as follows:

25 September 2019
27 April 2020
1994
28 February 2021
1986

Now I want to convert the year-only entries, 1994 and 1986, to:

01 January 1994
01 January 1986

Other full dates should be left as they are.

Any help is appreciated, especially a tidyverse approach.

CodePudding user response:

Given some vector d of dates and years:

> d
[1] "25 September 2019" "27 April 2020"     "1994"             
[4] "28 February 2021"  "1986"             

Replace any entries that are exactly 4 characters long with the same value with "01 January " pasted in front:

> d[nchar(d)==4] = paste0("01 January ",d[nchar(d)==4])

Giving:

> d
[1] "25 September 2019" "27 April 2020"     "01 January 1994"  
[4] "28 February 2021"  "01 January 1986"  
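
If you would rather not overwrite d in place, the same length test can be wrapped in a small function. This is only a sketch: pad_year_only is a made-up name, not something from a package, and like the line above it treats any 4-character string (even a non-numeric one such as "n.a.") as a year.

```r
# Hypothetical helper (name invented for illustration): returns a new vector
# in which 4-character entries get "01 January " pasted in front.
pad_year_only <- function(x) {
  is_year <- nchar(x) == 4                       # same length test as above
  ifelse(is_year, paste0("01 January ", x), x)   # other entries pass through unchanged
}

d <- c("25 September 2019", "27 April 2020", "1994",
       "28 February 2021", "1986")
d_fixed <- pad_year_only(d)   # d itself is left untouched
```

Because ifelse() is vectorised, the function works on a whole column at once, so it could also be called inside mutate().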

CodePudding user response:

A regex solution: the anchors ^ (start of string) and $ (end of string) restrict the match to values that consist of exactly four digits, and the backreference \\1 puts the captured year back into the replacement:

library(dplyr)
df %>%
  mutate(dates = sub("^(\\d{4})$", "01 January \\1", dates))
              dates
1 25 September 2019
2     27 April 2020
3   01 January 1994
4  28 February 2021
5   01 January 1986
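
To see why the anchors matter, here is a small comparison on two made-up values. The variable names are only for illustration. Without ^ and $, the pattern also matches the year inside a full date, so the prefix ends up in the middle of the string.

```r
x <- c("1994", "27 April 2020")

# Anchored: only strings that are exactly four digits are rewritten
anchored <- sub("^(\\d{4})$", "01 January \\1", x)

# Unanchored: the first run of four digits anywhere in the string is rewritten
unanchored <- sub("(\\d{4})", "01 January \\1", x)

anchored    # "01 January 1994" "27 April 2020"
unanchored  # "01 January 1994" "27 April 01 January 2020"
```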

base R:

df$dates <- sub("^(\\d{4})$", "01 January \\1", df$dates)

Data:

df <- data.frame(
  dates = c("25 September 2019",
            "27 April 2020",
            "1994",
            "28 February 2021",
            "1986")
)
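
Once every entry has the same "day month year" shape, the strings can be parsed into real Date values if that is the end goal. This goes a step beyond what was asked, so treat it as a sketch: it assumes an English locale, because %B matches full month names in the current locale, and entries that fail to parse come back as NA.

```r
dates <- c("25 September 2019", "27 April 2020", "1994",
           "28 February 2021", "1986")

# Step 1: normalise the year-only entries, as in the answer above
normalised <- sub("^(\\d{4})$", "01 January \\1", dates)

# Step 2: parse; %d = day of month, %B = full month name (locale-dependent),
# %Y = four-digit year
parsed <- as.Date(normalised, format = "%d %B %Y")

# Step 3 (optional): format the dates back into the original display style
shown <- format(parsed, "%d %B %Y")
```

Keeping the column as Date rather than character makes sorting and date arithmetic work as expected.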