I'm new to r. This is my first forum question... I'm trying to merge two datasets like these:
df1 <- data.frame(ID = letters[1:5],
x = 5:9,
y = c(NA,6,5,NA,NA))
> df1
ID x y
1 a 5 NA
2 b 6 6
3 c 7 5
4 d 8 NA
5 e 9 NA
df2 <- data.frame(ID = letters[4:8],
y = 7:11,
z = c(4,3,NA,2,1))
> df2
ID y z
1 d 7 4
2 e 8 3
3 f 9 NA
4 g 10 2
5 h 11 1
outcome should look like this:
> df3
ID x y z
1 a 5 NA NA
2 b 6 6 NA
3 c 7 5 NA
4 d 8 7 4
5 e 9 8 3
6 f NA 9 NA
7 g NA 10 2
8 h NA 11 1
I looked around in various forums but couldn't find the solution to this specific problem :/
Any suggestions are much appreciated!
CodePudding user response:
You can first do a full_join
, then use coalesce
to replace NA
with the non-missing one.
The final relocate
function is just to order the column according to your desired format.
library(tidyverse)
full_join(df1, df2, by = "ID") %>%
mutate(y = coalesce(y.x, y.y), .keep = "unused") %>%
relocate(y, .after = x)
ID x y z
1 a 5 NA NA
2 b 6 6 NA
3 c 7 5 NA
4 d 8 7 4
5 e 9 8 3
6 f NA 9 NA
7 g NA 10 2
8 h NA 11 1
CodePudding user response:
We can use {powerjoin} :
library(powerjoin)
power_full_join(df1, df2, by = "ID", conflict = coalesce_xy)
#> ID x z y
#> 1 a 5 NA NA
#> 2 b 6 NA 6
#> 3 c 7 NA 5
#> 4 d 8 4 7
#> 5 e 9 3 8
#> 6 f NA NA 9
#> 7 g NA 2 10
#> 8 h NA 1 11
Created on 2022-04-14 by the reprex package (v2.0.1)
CodePudding user response:
data.table
approach
library(data.table)
DT <- rbindlist(list(df1, df2), use.names = TRUE, fill = TRUE)
final <- DT[, lapply(.SD, max, na.rm = TRUE),keyby = ID]
# replace infinite values with NA
for (j in 1:ncol(final)) set(final, which(is.infinite(final[[j]])), j, NA)
# ID x y z
# 1: a 5 NA NA
# 2: b 6 6 NA
# 3: c 7 5 NA
# 4: d 8 7 4
# 5: e 9 8 3
# 6: f NA 9 NA
# 7: g NA 10 2
# 8: h NA 11 1
CodePudding user response:
A base R option using merge
aggregate
aggregate(
. ~ ID,
merge(df1, df2, all = TRUE),
function(x) ifelse(all(is.na(x)), NA, na.omit(x)),
na.action = na.pass
)
gives
ID y x z
1 a NA 5 NA
2 b 6 6 NA
3 c 5 7 NA
4 d 7 8 4
5 e 8 9 3
6 f 9 NA NA
7 g 10 NA 2
8 h 11 NA 1