I have two tables with people leaving a company from different data sources. The two tables have different length and not every individual, who is in one table, is also in the other one. There is a column for leaving date in both tables. In one column for leaving date there are no missing values in the table and in the other one there are some missing values. I want R to overrite the column without missing values, with the values from the other column.
The original data looks like this:
ID exit1 exit2
1 1 N/A 31/01/2016
2 2 01/02/2016 01/01/2021
3 3 01/10/2010 30/09/2019
4 4 N/A 31/12/2015
5 5 01/01/2016 30/09/2020
And I want my data to look like this:
ID exit2
1 1 31/01/2016
2 2 01/02/2016
3 3 01/10/2010
4 4 31/12/2015
5 5 01/01/2016
Can anyone help?
Thanks!
CodePudding user response:
You could use
library(dplyr)
df %>%
mutate(
exit = coalesce(na_if(exit1, "N/A"), exit2),
.keep = "unused"
)
This returns
# A tibble: 5 x 2
ID exit
<dbl> <chr>
1 1 31/01/2016
2 2 01/02/2016
3 3 01/10/2010
4 4 31/12/2015
5 5 01/01/2016
Data
structure(list(ID = c(1, 2, 3, 4, 5), exit1 = c("N/A", "01/02/2016",
"01/10/2010", "N/A", "01/01/2016"), exit2 = c("31/01/2016", "01/01/2021",
"30/09/2019", "31/12/2015", "30/09/2020")), row.names = c(NA,
-5L), class = c("tbl_df", "tbl", "data.frame"))
CodePudding user response:
An alternative with the data.table
package:
library(data.table)
DT[ ,`:=` (exit1 = NULL, exit2 = ifelse(is.na(exit1), exit2, exit1))]
DT
#> ID exit2
#> 1: 1 31/01/2016
#> 2: 2 01/02/2016
#> 3: 3 01/10/2010
#> 4: 4 31/12/2015
#> 5: 5 01/01/2016
Your data:
DT <- data.table(ID = 1:5,
exit1 = c(NA, "01/02/2016", "01/10/2010", NA, "01/01/2016"),
exit2 = c("31/01/2016", "01/01/2021", "30/09/2019", "31/12/2015", "30/09/2020"))
Created on 2021-10-05 by the reprex package (v0.3.0)