I have two dataframes, I wish to combine the unique values and update a column value alone of the existing record instead of adding new record.
DF1<-data.frame(name = c("Test1","Test2","Test3"),
Value = c(10,20,30),
Remarks = c("Good","Best","Perfect")
)
DF2<- data.frame(name = c("Test","Test2","Test3","Test4"),
Value = c(10,22,30, 40),
Remarks = c("Good","Best","Perfect","NEW"))
I have attempted
DF3 <- unique(bind_rows(x = DF1, y = DF2))
The result is
name Value Remarks
1 Test1 10 Good
2 Test2 20 Best
3 Test3 30 Perfect
4 Test 10 Good
5 Test2 22 Best
7 Test4 40 NEW
But I desire to achieve
name Value Remarks
1 Test1 10 Good
2 Test2 **22** Best
3 Test3 30 Perfect
4 Test 10 Good
5 Test4 40 NEW
For Exisiting values instead of adding new records, I wish to update.
CodePudding user response:
Here's an option using dplyr
:
library(dplyr)
DF1<-data.frame( name= c("Test1","Test2","Test3"),
Value=c(10,20,30),
Remarks =c("Good","Best","Perfect")
)
DF2<- data.frame( name= c("Test","Test2","Test3","Test4"),
Value=c(10,22,30, 40),
Remarks =c("Good","Best","Perfect","NEW"))
right_join(DF1, DF2) %>%
bind_rows(DF1 %>% filter(!name %in% DF2$name)) %>%
arrange(name)
#> Joining, by = c("name", "Value", "Remarks")
#> name Value Remarks
#> 1 Test 10 Good
#> 2 Test1 10 Good
#> 3 Test2 22 Best
#> 4 Test3 30 Perfect
#> 5 Test4 40 NEW
CodePudding user response:
You can use powerjoin
, which can handle conflicted columns when joining.
library(powerjoin)
power_full_join(DF1, DF2, by = "name", conflict = coalesce_yx)
# name Value Remarks
# 1 Test1 10 Good
# 2 Test2 22 Best
# 3 Test3 30 Perfect
# 4 Test 10 Good
# 5 Test4 40 NEW
conflict = coalesce_xy
: Columns from the first data are prior to those from the second data when conflicting.conflict = coalesce_yx
: Columns from the second data are prior to those from the first data when conflicting.