Home > Software design >  Merge to sets of data and replace current value if is in update column
Merge to sets of data and replace current value if is in update column

Time:07-08

With the following data frames. I would like to create a new data frame where quantities for IDs in df2 are updated with the 'Update quantities' values and IDs are replaces with the 'Update_ID' values (which may or may not change from the current ID.

df1 <- data.frame(ID = LETTERS[1:10],Quantity = c(1,2,3,4,5,6,7,8,9,10))
df2 <- data.frame(Cur_ID = c("A","C","D","H"), Update_ID = c("A","C","X","Y"), Update_Quantity = c(20, 21, 22, 23))

desired df3:
ID  letters
A   20
B   2
C   21
X   22
E   5
F   6
G   7
Y   23
I   9
J   10

i.e. the quantities for all in df2 are updates, and the IDs if Update_ID is different from Cur_ID.

CodePudding user response:

With dplyr, you could merge df2 into df1 and use coalesce() to update ID and Quantity.

library(dplyr)

df1 %>%
  left_join(df2, by = c("ID" = "Cur_ID")) %>%
  mutate(ID = coalesce(Update_ID, ID),
         Quantity = coalesce(Update_Quantity, Quantity), .keep = "unused")

Another option with powerjoin:

library(powerjoin)

power_left_join(df1, setNames(df2, c("Cur_ID", "ID", "Quantity")),
                by = c("ID" = "Cur_ID"), conflict = coalesce_yx)
Output
#    ID Quantity
# 1   A       20
# 2   B        2
# 3   C       21
# 4   X       22
# 5   E        5
# 6   F        6
# 7   G        7
# 8   Y       23
# 9   I        9
# 10  J       10

CodePudding user response:

Using data.table join

library(data.table)
setDT(df1)[df2, c('ID', 'Quantity') := .(Update_ID, 
         Update_Quantity), on = .(ID = Cur_ID)]

-output

> df1
        ID Quantity
    <char>    <num>
 1:      A       20
 2:      B        2
 3:      C       21
 4:      X       22
 5:      E        5
 6:      F        6
 7:      G        7
 8:      Y       23
 9:      I        9
10:      J       10
  • Related