I have two datasets with similar variables. dataset2 has values of of variables that were not captured in dataset2. My aim is to use the dataset2 variables to fill the corresponding values in variables in dataset1. Is there a way to achieve this. It is possible to use coalesce but listing all the variables is a bit cumbersome.
library(dplyr)
dat1 <- tibble(
id = c("soo1", "soo2", "soo3", "soo4"),
a1= c("Test", "Tested", "Testing", NA),
a2= c("Math", "Eng", NA, "French"),
a3= c("Science", NA, "Biology", "Chem"))
dat2 <- tibble(
id = c("soo1", "soo2", "soo3", "soo4"),
a1= c(NA, NA, NA, "Tested"),
a2= c("Math", NA, "UK", NA),
a3= c("Science", "Physic", NA, NA))
dat1 %>%
inner_join(dat2, by = "id") %>%
mutate(a1 = coalesce(a1.x, a1.y),
a2 = coalesce(a2.x, a2.y))
CodePudding user response:
You could also fill
your values "downup" per group for every column like this:
library(dplyr)
library(tidyr)
dat1 <- tibble(
id = c("soo1", "soo2", "soo3", "soo4"),
a1= c("Test", "Tested", "Testing", NA),
a2= c("Math", "Eng", NA, "French"),
a3= c("Science", NA, "Biology", "Chem"))
dat2 <- tibble(
id = c("soo1", "soo2", "soo3", "soo4"),
a1= c(NA, NA, NA, "Tested"),
a2= c("Math", NA, "UK", NA),
a3= c("Science", "Physic", NA, NA))
dat1 %>%
bind_rows(dat2) %>%
group_by(id) %>%
fill(everything(), .direction = "downup") %>%
slice(1)
#> # A tibble: 4 × 4
#> # Groups: id [4]
#> id a1 a2 a3
#> <chr> <chr> <chr> <chr>
#> 1 soo1 Test Math Science
#> 2 soo2 Tested Eng Physic
#> 3 soo3 Testing UK Biology
#> 4 soo4 Tested French Chem
Created on 2022-07-18 by the reprex package (v2.0.1)
CodePudding user response:
Another possible solution, based on powerjoin
:
library(powerjoin)
library(tibble)
power_inner_join(dat1, dat2, by = "id", conflict = coalesce_xy)
#> # A tibble: 4 × 4
#> id a1 a2 a3
#> <chr> <chr> <chr> <chr>
#> 1 soo1 Test Math Science
#> 2 soo2 Tested Eng Physic
#> 3 soo3 Testing UK Biology
#> 4 soo4 Tested French Chem