Home > Software design >  Update Entire column in a data frame with Values of another values based on conditions
Update Entire column in a data frame with Values of another values based on conditions

Time:12-30

I have two dataframes, I wish to combine the unique values and update a column value alone of the existing record instead of adding new record.

DF1<-data.frame(name = c("Test1","Test2","Test3"),
                Value = c(10,20,30),
                Remarks = c("Good","Best","Perfect")
)
DF2<- data.frame(name = c("Test","Test2","Test3","Test4"),
                 Value = c(10,22,30, 40),
                 Remarks = c("Good","Best","Perfect","NEW"))

I have attempted

DF3 <- unique(bind_rows(x = DF1, y = DF2))

The result is

   name Value Remarks
1 Test1    10    Good
2 Test2    20    Best
3 Test3    30 Perfect
4  Test    10    Good
5 Test2    22    Best
7 Test4    40     NEW

But I desire to achieve

   name Value Remarks
1 Test1    10    Good
2 Test2    **22**    Best
3 Test3    30 Perfect
4  Test    10    Good
5 Test4    40     NEW

For Exisiting values instead of adding new records, I wish to update.

CodePudding user response:

Here's an option using dplyr:

library(dplyr)

DF1<-data.frame( name= c("Test1","Test2","Test3"),
                 Value=c(10,20,30),
                 Remarks =c("Good","Best","Perfect")
)

DF2<- data.frame( name= c("Test","Test2","Test3","Test4"),
                  Value=c(10,22,30, 40),
                  Remarks =c("Good","Best","Perfect","NEW"))

right_join(DF1, DF2) %>% 
  bind_rows(DF1 %>% filter(!name %in% DF2$name)) %>% 
  arrange(name)
#> Joining, by = c("name", "Value", "Remarks")
#>    name Value Remarks
#> 1  Test    10    Good
#> 2 Test1    10    Good
#> 3 Test2    22    Best
#> 4 Test3    30 Perfect
#> 5 Test4    40     NEW

CodePudding user response:

You can use powerjoin, which can handle conflicted columns when joining.

library(powerjoin)

power_full_join(DF1, DF2, by = "name", conflict = coalesce_yx)

#    name Value Remarks
# 1 Test1    10    Good
# 2 Test2    22    Best
# 3 Test3    30 Perfect
# 4  Test    10    Good
# 5 Test4    40     NEW
  • conflict = coalesce_xy: Columns from the first data are prior to those from the second data when conflicting.
  • conflict = coalesce_yx: Columns from the second data are prior to those from the first data when conflicting.
  • Related