I have two tables, from different data sources, with data on people leaving a company. The leaving-date column in one table has some missing values, and in those cases I want to take the date from the other table.
The columns in the two tables look like this:
ID exit1 exit2
1 1 N/A 31/01/2016
2 2 01/02/2016 01/01/2021
3 3 01/10/2010 30/09/2019
4 4 N/A 31/12/2015
5 5 01/01/2016 30/09/2020
I want my result to be like this:
ID exit1
1 1 31/01/2016
2 2 01/02/2016
3 3 01/10/2010
4 4 31/12/2015
5 5 01/01/2016
Does anyone know how I could do this?
Thanks!
CodePudding user response:
We can use coalesce() from dplyr, after first converting the "N/A" strings to real NA values with na_if():
library(dplyr)
df1 %>%
  mutate(exit1 = na_if(exit1, "N/A")) %>%
  transmute(ID, exit1 = coalesce(exit1, exit2))
-output
ID exit1
1 1 31/01/2016
2 2 01/02/2016
3 3 01/10/2010
4 4 31/12/2015
5 5 01/01/2016
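If the two exit columns really live in separate tables, as the question describes, they would need to be joined on ID first. A minimal sketch, assuming hypothetical tables `tab1` and `tab2` (names not from the question) that share an `ID` column:

```r
library(dplyr)

# Hypothetical split of the example data into two source tables
tab1 <- data.frame(
  ID = 1:5,
  exit1 = c("N/A", "01/02/2016", "01/10/2010", "N/A", "01/01/2016"),
  stringsAsFactors = FALSE
)
tab2 <- data.frame(
  ID = 1:5,
  exit2 = c("31/01/2016", "01/01/2021", "30/09/2019", "31/12/2015", "30/09/2020"),
  stringsAsFactors = FALSE
)

res <- tab1 %>%
  left_join(tab2, by = "ID") %>%            # bring exit2 alongside exit1
  mutate(exit1 = na_if(exit1, "N/A")) %>%   # turn the "N/A" strings into real NA
  transmute(ID, exit1 = coalesce(exit1, exit2))
res
```

left_join() keeps every row of `tab1`, so people who are missing from `tab2` simply keep whatever `exit1` already holds.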
data
df1 <- structure(list(ID = 1:5, exit1 = c("N/A", "01/02/2016", "01/10/2010",
"N/A", "01/01/2016"), exit2 = c("31/01/2016", "01/01/2021", "30/09/2019",
"31/12/2015", "30/09/2020")), class = "data.frame",
row.names = c("1",
"2", "3", "4", "5"))
CodePudding user response:
Here is an alternative approach with an ifelse statement:
library(dplyr)
df1 %>%
  mutate(exit1 = ifelse(exit1 == "N/A", exit2, exit1), .keep = "unused")
ID exit1
1 1 31/01/2016
2 2 01/02/2016
3 3 01/10/2010
4 4 31/12/2015
5 5 01/01/2016
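One caveat in case the columns are stored as Date rather than character: base ifelse() drops the Date class and returns plain numbers, while dplyr's if_else() keeps the type. A small sketch, assuming the "N/A" entries have already become real NA values during the conversion (the name `df_dates` is made up for illustration):

```r
library(dplyr)

# Assumption: the example data, already converted to Date; "N/A" is NA here
df_dates <- data.frame(
  ID = 1:5,
  exit1 = as.Date(c(NA, "01/02/2016", "01/10/2010", NA, "01/01/2016"),
                  format = "%d/%m/%Y"),
  exit2 = as.Date(c("31/01/2016", "01/01/2021", "30/09/2019", "31/12/2015", "30/09/2020"),
                  format = "%d/%m/%Y")
)

res_dates <- df_dates %>%
  # if_else() preserves the Date class of exit1/exit2
  mutate(exit1 = if_else(is.na(exit1), exit2, exit1), .keep = "unused")

class(res_dates$exit1)
```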
CodePudding user response:
Or using base R:
df$exit1[is.na(df$exit1)] = df$exit2[is.na(df$exit1)]
and if the value is the actual string "N/A":
df$exit1[df$exit1 == 'N/A'] = df$exit2[df$exit1 == 'N/A']
and to keep the desired columns:
df = df[,c('ID', 'exit1')]
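If the column might contain both real NA values and the string "N/A", the two checks can be combined into one mask. A rough sketch on made-up data that mixes both kinds of missing value (the `is_missing` name and the final Date conversion are my additions, not part of the question):

```r
# Example data with one "N/A" string and one real NA in exit1
df <- data.frame(
  ID = 1:5,
  exit1 = c("N/A", "01/02/2016", "01/10/2010", NA, "01/01/2016"),
  exit2 = c("31/01/2016", "01/01/2021", "30/09/2019", "31/12/2015", "30/09/2020"),
  stringsAsFactors = FALSE
)

# TRUE where exit1 is a real NA or the literal string "N/A"
is_missing <- is.na(df$exit1) | df$exit1 == "N/A"

df$exit1[is_missing] <- df$exit2[is_missing]
df <- df[, c("ID", "exit1")]

# Optional: convert the day/month/year strings to Date
df$exit1 <- as.Date(df$exit1, format = "%d/%m/%Y")
df
```

Putting is.na() first matters: `TRUE | NA` is TRUE, so the mask never contains NA, which subscripted assignment would otherwise reject.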