Home > Mobile >  Insert value from df2 into matching df1 cell if exists, append df2 rows if not exists in df1, and in
Insert value from df2 into matching df1 cell if exists, append df2 rows if not exists in df1, and in

Time:11-18

How can I replace values in df1 with var values in df2 if they match BOTH code1 and code2, and append non-matches to the bottom keeping the order of df1 AND indexing whether the line was edited:

Sample data:

#create sample df1
df1 <- data.frame(code1=c("A","B","C","E","E","A"),code2=c("C3","C1","C4","C3","C5","C3"),var1=c(1,2,3,4,5,6),var2=c("vA","vB","vC","vD","vE","vF"),ver=1,stringsAsFactors = F)

#create sample df2
df2 <- data.frame(code1=c("E","B","Z","Z"),code2=c("C3","C1","C3","C5"),var1=c(9,10,1,5),var2=c("vX","vY","vD","vD"),ver=2,stringsAsFactors = F)

I'd like a function that creates the following result df:


df_desired <- data.frame(code1=c("A","B","C","E","E","A","Z","Z"),code2=c("C3","C1","C4","C3","C5","C3","C3","C5"),var1=c(1,10,3,9,5,6,1,5),var2=c("vA","vY","vC","vX","vE","vF","vD","vD"),ver=c(1,2,1,2,1,1,2,2),stringsAsFactors = F)

CodePudding user response:

Here's one approach. I'm sure you could make some of the steps a little more elegant, but it works.

f <- function(df1, df2) {
  a <- match(df1$code1, df2$code1)
  b <- match(df1$code2, df2$code2)
  is_match <- a == b & !is.na(a) & !is.na(b)
  where_match <- a[is_match]
  
  df1[is_match,] <- df2[where_match,]
  df1$edited <- is_match
  
  bind_rows(df1, df2[1:nrow(df2) %in% where_match,])
}

f(df1, df2)
  code1 code2 var1 var2 ver edited
1     A    C3    1   vA   1  FALSE
2     B    C1   10   vY   2   TRUE
3     C    C4    3   vC   1  FALSE
4     E    C3    9   vX   2   TRUE
5     E    C5    5   vE   1  FALSE
6     A    C3    6   vF   1  FALSE
7     E    C3    9   vX   2     NA
8     B    C1   10   vY   2     NA

CodePudding user response:

We could also use the (experimental) rows_upsert from dplyr:

library(dplyr)

df1 |>
  rows_upsert(df2, by = c("code1", "code2"))

Output:

  code1 code2 var1 var2 ver
1     A    C3    1   vA   1
2     B    C1   10   vY   2
3     C    C4    3   vC   1
4     E    C3    9   vX   2
5     E    C5    5   vE   1
6     A    C3    6   vF   1
7     Z    C3    1   vD   2
8     Z    C5    5   vD   2

I guess ver is the indexing variable you need.

  • Related