I have two dataframes as shown below:
structure(list(ID = c("a", "a", "c", "d", "a", "f"), a1 = c("2001",
"2000", "2000", "1998", "2006", "2002"), value = c("100", NA,
"200", NA, "400", "500")), class = "data.frame", row.names = c(NA,
-6L))
ID2 <- c("x","a","c","d","n","f","g","m")
a2 <- c(2000,2000,2000,2000,2000,2000,2000,2000)
value2 <- c(100, 1000, 20000, 1000, 400,500,1,1)
ID a1 value
1 a 2001 100
2 a 2000 <NA>
3 c 2000 200
4 d 1998 <NA>
5 a 2006 400
6 f 2002 500
ID2 a2 value2
1 x 2000 100
2 a 2000 1000
3 c 2000 20000
4 d 2000 1000
5 n 2000 400
6 f 2000 500
7 g 2000 1
8 m 2000 1
what I want to do: I want to replace only missing value in column value of data frame 1 by corresponding value from value2 column of dataframe 2. the year and ID should match. I tried the following code using left_join
; however, this gave me the wrong result
r <- left_join(data, data2, by=c("ID"="ID2","a1"="a2"))
Any suggetsion?
expected result
ID a1 value
1 a 2001 100
2 a 2000 **1000**
3 c 2000 200
4 d 1998 <NA>
5 a 2006 400
6 f 2002 500
CodePudding user response:
Here's a tidyverse
approach.
Note that because your a1
and value
from dataframe 1 is character
, I've also changed the corresponding columns of dataframe 2 to character
.
What you need to do is to left_join
them together, and coalesce
the value
with value2
.
library(tidyverse)
ID2 <- c("x","a","c","d","n","f","g","m")
a2 <- c("2000","2000","2000","2000","2000","2000","2000","2000")
value2 <- c("100", "1000", "20000", "1000", "400","500","1","1")
df2 <- data.frame(ID2, a2, value2)
left_join(df1, df2, by = c("ID" = "ID2", "a1" = "a2")) %>%
transmute(ID = ID, a1 = a1, value = coalesce(value, value2))
ID a1 value
1 a 2001 100
2 a 2000 1000
3 c 2000 200
4 d 1998 <NA>
5 a 2006 400
6 f 2002 500