How to make R take values from another column in case of missing values?

Time:10-06

I have two tables with data on people leaving a company, drawn from different data sources. The leaving-date column in one of them has some missing values, and in those cases I want to take the date from the other table.

The columns in the two tables look like this:

  ID      exit1      exit2
1  1        N/A 31/01/2016
2  2 01/02/2016 01/01/2021
3  3 01/10/2010 30/09/2019
4  4        N/A 31/12/2015
5  5 01/01/2016 30/09/2020

I want my result to be like this:

  ID      exit1
1  1 31/01/2016
2  2 01/02/2016
3  3 01/10/2010
4  4 31/12/2015
5  5 01/01/2016

Does anyone know how I could do this?

Thanks!

CodePudding user response:

We can use coalesce(), after first converting the "N/A" strings to real NA values with na_if():

library(dplyr)
df1 %>%
   mutate(exit1 = na_if(exit1, "N/A")) %>%
   transmute(ID, exit1 = coalesce(exit1, exit2))

-output

  ID      exit1
1  1 31/01/2016
2  2 01/02/2016
3  3 01/10/2010
4  4 31/12/2015
5  5 01/01/2016

data

df1 <- structure(list(ID = 1:5, exit1 = c("N/A", "01/02/2016", "01/10/2010", 
"N/A", "01/01/2016"), exit2 = c("31/01/2016", "01/01/2021", "30/09/2019", 
"31/12/2015", "30/09/2020")), class = "data.frame", 
row.names = c("1", 
"2", "3", "4", "5"))

CodePudding user response:

Here is an alternative approach using ifelse():

library(dplyr)
df1 %>%
  mutate(exit1 = ifelse(exit1 == "N/A", exit2, exit1), .keep = "unused")

-output

  ID      exit1
1  1 31/01/2016
2  2 01/02/2016
3  3 01/10/2010
4  4 31/12/2015
5  5 01/01/2016
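
One caveat with ifelse(): it works here because both columns are character strings. If the columns were real Date values, ifelse() would drop the Date class and return plain numbers. The following is a minimal base R sketch, assuming the dates are always in day/month/year format; the names d1, d2 and result are only illustrative. It parses both columns first and then fills the gaps by index assignment, which keeps the Date class.

```r
# Same data as df1 above, rebuilt here so the sketch is self-contained
df1 <- data.frame(
  ID    = 1:5,
  exit1 = c("N/A", "01/02/2016", "01/10/2010", "N/A", "01/01/2016"),
  exit2 = c("31/01/2016", "01/01/2021", "30/09/2019", "31/12/2015", "30/09/2020"),
  stringsAsFactors = FALSE
)

# With an explicit format, strings that do not parse (such as "N/A") become NA
d1 <- as.Date(df1$exit1, format = "%d/%m/%Y")
d2 <- as.Date(df1$exit2, format = "%d/%m/%Y")

# Fill only the missing positions; index assignment preserves the Date class
d1[is.na(d1)] <- d2[is.na(d1)]

result <- data.frame(ID = df1$ID, exit1 = d1)
```

Working with Date values rather than strings also means the column sorts and compares chronologically.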

CodePudding user response:

Or using base R:

df1$exit1[is.na(df1$exit1)] <- df1$exit2[is.na(df1$exit1)]

and if the value is the actual string "N/A":

df1$exit1[df1$exit1 == 'N/A'] <- df1$exit2[df1$exit1 == 'N/A']

and to keep the desired columns:

df1 <- df1[, c('ID', 'exit1')]
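
If the two exit columns really sit in two separate tables, as the question describes, they need to be joined on ID first. Here is a rough base R sketch of that step; the table names source1 and source2 are made up for illustration, and it assumes ID is unique in both tables.

```r
# Hypothetical tables, one per data source
source1 <- data.frame(
  ID    = 1:5,
  exit1 = c("N/A", "01/02/2016", "01/10/2010", "N/A", "01/01/2016"),
  stringsAsFactors = FALSE
)
source2 <- data.frame(
  ID    = 1:5,
  exit2 = c("31/01/2016", "01/01/2021", "30/09/2019", "31/12/2015", "30/09/2020"),
  stringsAsFactors = FALSE
)

# Left join: keep every row of source1 and attach exit2 where the ID matches
merged <- merge(source1, source2, by = "ID", all.x = TRUE)

# Treat both a real NA and the literal string "N/A" as missing
missing <- is.na(merged$exit1) | merged$exit1 == "N/A"
merged$exit1[missing] <- merged$exit2[missing]

result <- merged[, c("ID", "exit1")]
```

Building the missing mask once and reusing it on both sides also avoids an error in the assignment if exit1 contains real NA values alongside the "N/A" strings.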