Converting string to dates where month and/or day is missing

Time:02-02

I have partial dates. If the day is missing, I want to append the 1st of the month; if both day and month are missing, I want to append 01 January. Then I want to convert the result to a date object with ymd:

aa = data.frame(a = c("2021-01", "2021"))
bb = aa %>%
  mutate(aa = case_when(nchar(a) == 7 ~ ymd(paste(a, "-01", sep = "")),
                        nchar(a) == 4 ~ ymd(paste(a, "-01-01", sep=""))))

Could you please advise me why I'm getting these warnings?

Warning messages:
1: Problem while computing `aa = case_when(...)`.
ℹ  1 failed to parse. 
2: Problem while computing `aa = case_when(...)`.
ℹ  1 failed to parse. 

I changed paste to paste0 and tried the truncated argument, but it didn't work. I have to use lubridate. With base R and as.Date it works.

CodePudding user response:

The warnings come from the way case_when operates. It first evaluates every right-hand side for all rows and only then keeps, for each row, the value from the matching case. You get warnings because, for example, the first case does not produce a valid date for the second row.
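To see why each case triggers exactly one warning, it can help to look at the strings each right-hand side builds for the sample data before the conditions are applied. This is only a sketch in base R; the names `first_branch` and `second_branch` are illustrative and not part of the original code.

```r
a <- c("2021-01", "2021")

# What the first right-hand side hands to ymd(), for ALL rows:
first_branch <- paste0(a, "-01")
first_branch
# "2021-01-01" "2021-01"        (second element is not a complete date)

# What the second right-hand side hands to ymd(), for ALL rows:
second_branch <- paste0(a, "-01-01")
second_branch
# "2021-01-01-01" "2021-01-01"  (first element has an extra "-01")

# The conditions only decide which element is kept afterwards.
nchar(a) == 7
# TRUE FALSE
```

Each vector contains one string that is not a complete year-month-day date, which matches the "1 failed to parse" message reported once per case.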

Try this instead


bb = aa %>%
  mutate(
    aa = ymd(case_when(
      nchar(a) == 7 ~ paste(a, "-01", sep = ""),
      nchar(a) == 4 ~ paste(a, "-01-01", sep = "")
    ))
  )

CodePudding user response:

as.Date ignores junk at the end of the string, so you can append "-01-01" to every value and use this. Apply format to the result if a character vector is needed.

aa <- data.frame(a = c("2021-01", "2021"))
as.Date(paste0(aa$a, "-01-01"))
## [1] "2021-01-01" "2021-01-01"
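If a character vector is needed rather than a Date, the format step mentioned above could look like the following. This is a small sketch in base R; the names `d` and `as_text` are illustrative.

```r
aa <- data.frame(a = c("2021-01", "2021"))

# Appending "-01-01" to everything relies on as.Date ignoring
# the trailing "-01" in "2021-01-01-01".
d <- as.Date(paste0(aa$a, "-01-01"))

# format() with no format string returns ISO "YYYY-MM-DD" text.
as_text <- format(d)
as_text
# "2021-01-01" "2021-01-01"
```

Other layouts are available through the format argument, for example format(d, "%d%b%Y"), though month abbreviations depend on the locale.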

CodePudding user response:

Try doing the conversion outside of case_when(). Your example fails because ymd() works on the whole column of strings, not only on the rows that meet the condition. case_when() evaluates each right-hand side for every row and only afterwards keeps the values where the condition is TRUE, so each ymd() call also sees strings that belong to the other case and cannot be parsed. You can reproduce the same kind of warning by giving ymd() one value it can parse and one it cannot:

ymd("2020-01-01",FALSE)

As mentioned by @Cettt, as.Date() simply does not produce a warning, but the results are the same.

library(tidyverse)
library(lubridate)
aa = data.frame(a = c("2021-01", "2021"))
bb = aa %>%
  mutate(aa = case_when(nchar(a) == 7 ~ paste(a, "-01", sep = ""),
                        nchar(a) == 4 ~ paste(a, "-01-01", sep="")),
         aa = ymd(aa))
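If the same padding is needed in several places, the "build the strings first, parse once" idea could be wrapped in a small helper. This is only a sketch: the function name `complete_partial_date` is hypothetical, and it assumes the inputs are always "YYYY", "YYYY-MM" or "YYYY-MM-DD".

```r
library(dplyr)
library(lubridate)

# Hypothetical helper: pad partial dates to "YYYY-MM-DD", then parse once.
complete_partial_date <- function(x) {
  padded <- case_when(
    nchar(x) == 10 ~ x,                    # already complete
    nchar(x) == 7  ~ paste0(x, "-01"),     # day missing
    nchar(x) == 4  ~ paste0(x, "-01-01")   # month and day missing
  )
  # ymd() only ever sees complete strings here
  ymd(padded)
}

res <- complete_partial_date(c("2021-01", "2021", "2021-03-15"))
res
# "2021-01-01" "2021-01-01" "2021-03-15"
```

Inside mutate() it would be used as mutate(aa = complete_partial_date(a)).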