Recode Dates with else statement being existing column value

Time:02-04

I have a dataset and am trying to recode a date column. Some rows hold the placeholder date 1900-01-01, which I want to change to missing. All other rows should keep their original date.

In SAS I would just do something like:

if date = '1900-01-01' then cleandate = ""; else cleandate = date;

Sample Data:

df <- data.frame(
  id = c(1, 2, 3),
  date = as.Date(c("1900-01-01", "1984-01-01", "1900-01-01"))
)

Desired outcome:

id date
1 NA (or however R assigns missing data so it won't be included in calculations)
2 1984-01-01
3 NA (or however R assigns missing data so it won't be included in calculations)

CodePudding user response:

In R, take a copy of the existing date column, and then replace the values. Just be sure to write the date in the default YYYY-MM-DD format, because the character string is converted to a Date for the comparison.

df$cleandate <- df$date
df$cleandate[df$cleandate == '1900-01-01'] <- NA
df
#  id       date  cleandate
#1  1 1900-01-01       <NA>
#2  2 1984-01-01 1984-01-01
#3  3 1900-01-01       <NA>

This can also be accomplished in one step using replace:

df$cleandate <- replace(df$date, df$date == '1900-01-01', NA)
df
#  id       date  cleandate
#1  1 1900-01-01       <NA>
#2  2 1984-01-01 1984-01-01
#3  3 1900-01-01       <NA>

You could of course skip the copy and overwrite the values in date directly, if you prefer:

df$date[df$date == '1900-01-01'] <- NA
df
#  id       date
#1  1       <NA>
#2  2 1984-01-01
#3  3       <NA>
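
Since the question asks for the missing values to be left out of calculations, here is a minimal sketch of how that could be checked. The names sentinel, n_missing and earliest are only illustrative and are not part of the original code. Note that most summary functions skip NA only when na.rm = TRUE is passed.

```r
# Rebuild the sample data from the question
df <- data.frame(
  id = c(1, 2, 3),
  date = as.Date(c("1900-01-01", "1984-01-01", "1900-01-01"))
)

# Hypothetical name: keep the placeholder as an explicit Date value
sentinel <- as.Date("1900-01-01")

# Same replace() approach as above, compared against a Date instead of a string
df$cleandate <- replace(df$date, df$date == sentinel, NA)

# Count the recoded rows
n_missing <- sum(is.na(df$cleandate))

# NA is dropped from the calculation only because na.rm = TRUE is set
earliest <- min(df$cleandate, na.rm = TRUE)
```

The column keeps its Date class after the replacement, so date arithmetic and sorting should still work on cleandate.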

CodePudding user response:

1) Create a copy of df, called df2, to preserve the input, and then use the is.na<- replacement function. It sets the elements on the left-hand side to NA wherever the right-hand side is TRUE.

df2 <- df
is.na(df2$date) <- df2$date == "1900-01-01"

1a) or with transform:

transform(df, date = `is.na<-`(date, date == "1900-01-01"))

2) We can use na_if in dplyr:

library(dplyr)
df %>% mutate(date = na_if(date, as.Date("1900-01-01")))
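
As a rough check of the base R variants in 1) and 1a), the sketch below runs both on the sample data and compares the results. df3 is a name introduced here for illustration. It uses base R only, so dplyr is not needed to run it.

```r
# Sample data from the question
df <- data.frame(
  id = c(1, 2, 3),
  date = as.Date(c("1900-01-01", "1984-01-01", "1900-01-01"))
)

# 1) is.na<- on a copy, leaving df untouched
df2 <- df
is.na(df2$date) <- df2$date == "1900-01-01"

# 1a) the same replacement function called inside transform()
df3 <- transform(df, date = `is.na<-`(date, date == "1900-01-01"))

# Both variants should flag the same rows as missing
same_rows <- identical(is.na(df2$date), is.na(df3$date))

# The original data frame still holds the placeholder dates
input_unchanged <- !anyNA(df$date)
```

Working on a copy, or using transform, which returns a new data frame, keeps the raw dates available in case the placeholder rows need to be inspected later.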