I have a dataframe 'df1' that looks like:
Number | Variable1 | Variable | Variable3 |
---|---|---|---|
1 | A | B | C |
2 | A | B | C |
3 | A | B | C |
4 | A | B | C |
5 | A | B | C |
And I have a second dataframe 'df2' that looks like:
Number | Variable1 | Variable | Variable3 |
---|---|---|---|
1 | D | E | F |
2 | G | H | I |
3 | J | K | L |
4 | M | N | O |
15 | P | Q | R |
I want to update the three Variable columns in df1 with the data in the Variable columns in df2 based on matching values in Number so that df1 ends up looking like:
Number | Variable1 | Variable | Variable3 |
---|---|---|---|
1 | D | E | F |
2 | G | H | I |
3 | J | K | L |
4 | M | N | O |
5 | A | B | C |
CodePudding user response:
You could use a power_left_join
from powerjoin
package with conflict = coalesce_yx
like this:
library(powerjoin)
power_left_join(df1, df2, by = "Number", conflict = coalesce_yx)
#> Number Variable1 Variable Variable3
#> 1 1 D E F
#> 2 2 G H I
#> 3 3 J K L
#> 4 4 M N O
#> 5 5 A B C
Created on 2022-12-13 with reprex v2.0.2
Data:
df1 <- read.table(text = 'Number Variable1 Variable Variable3
1 A B C
2 A B C
3 A B C
4 A B C
5 A B C
', header = TRUE)
df2 <- read.table(text = 'Number Variable1 Variable Variable3
1 D E F
2 G H I
3 J K L
4 M N O
15 P Q R
', header = TRUE)
CodePudding user response:
Would be helpful if dput(df) done. Have created another dataset for replication
df1<-cbind.data.frame(id=c(1:5),var1=rep("A",5),var2=rep("B",5),var3=rep("C",5))
df2<-cbind.data.frame(id=c(1:4,15),var1=LETTERS[7:11],var2=LETTERS[12:16],var3=LETTERS[16:20])
df1 %>%
left_join(df2, by = "id") %>%
mutate(var1 = coalesce(var1.y, var1.x),
var2 = coalesce(var2.y, var2.x),
var3 = coalesce(var3.y, var3.x)) %>%
select(-var1.y, -var1.x,
-var2.y, -var2.x,
-var3.y, -var3.x)
CodePudding user response:
With dplyr
, we can use rows_update
library(dplyr)
rows_update(df1, df2, by = 'Number', unmatched = "ignore")
-output
Number Variable1 Variable Variable3
1 1 D E F
2 2 G H I
3 3 J K L
4 4 M N O
5 5 A B C
CodePudding user response:
You could update df1 while joining using data.table package and fcoalesce
function:
library(data.table)
cols = c("Variable1", "Variable", "Variable3")
setDT(df1)[df2, (cols) := Map(fcoalesce, mget(paste0("i.", cols)), mget(cols)), on="Number"]
Number Variable1 Variable Variable3
<int> <char> <char> <char>
1: 1 D E F
2: 2 G H I
3: 3 J K L
4: 4 M N O
5: 5 A B C