I have a dataframe 'df1' with a lot of columns, but the ones of interest are:
Number | Code |
---|---|
1 | |
2 | |
3 | |
10 | |
11 | AMRO |
4 | |
277 | |
2100 | BLPH |
And I have another dataframe 'df2' with a lot of columns, but the ones of interest are:
Number | Code |
---|---|
1 | AMCR |
2 | AMCR |
3 | BANO |
10 | BAEA |
12 | AMRO |
4 | NA |
277 | NA |
2100 | NA |
I want matching values in the 'Number' columns of 'df1' and 'df2' to lead to values in the 'Code' column in 'df2' to overwrite the 'Code' values in 'df1' as long as the 'Code' values in 'df2' don't contain an NA, so that the final result of 'df1' looks like:
Number | Code |
---|---|
1 | AMCR |
2 | AMCR |
3 | BANO |
10 | BAEA |
11 | AMRO |
4 | |
277 | |
2100 | BLPH |
Thank you for your help!
CodePudding user response:
We can do
library(powerjoin)
power_left_join(df1, df2, by = "Number", conflict = coalesce)
-output
Number Code
1 1 AMCR
2 2 AMCR
3 3 BANO
4 10 BAEA
5 11 AMRO
6 4 <NA>
7 277 <NA>
8 2100 BLPH
Or to do an overwrite, use data.table
library(data.table)
setDT(df1)[df2, Code := fcoalesce(Code, i.Code), on = .(Number)]
-output
> df1
Number Code
<int> <char>
1: 1 AMCR
2: 2 AMCR
3: 3 BANO
4: 10 BAEA
5: 11 AMRO
6: 4 <NA>
7: 277 <NA>
8: 2100 BLPH
data
df1 <- structure(list(Number = c(1L, 2L, 3L, 10L, 11L, 4L, 277L, 2100L
), Code = c(NA, NA, NA, NA, "AMRO", NA, NA, "BLPH")),
class = "data.frame", row.names = c(NA,
-8L))
df2 <- structure(list(Number = c(1L, 2L, 3L, 10L, 12L, 4L, 277L, 2100L
), Code = c("AMCR", "AMCR", "BANO", "BAEA", "AMRO", NA, NA, NA
)), class = "data.frame", row.names = c(NA, -8L))
CodePudding user response:
Here is an alternative approach using bind_cols
:
library(dplyr)
bind_cols(df1, df2) %>%
mutate(Code = coalesce(Code...2, Code...4)) %>%
select(Number = Number...1, Code)
Number Code
1 1 AMCR
2 2 AMCR
3 3 BANO
4 10 BAEA
5 11 AMRO
6 4 <NA>
7 277 <NA>
8 2100 BLPH
CodePudding user response:
Here is a solution playing with dplyr
full_join
and inner_join
library(dplyr)
df1 %>%
full_join(df2) %>% na.omit() %>%
full_join(df1 %>% inner_join(df2)) %>%
filter(Number %in% df1$Number) %>%
arrange(Number)
Output
#> Number Code
#> 1 1 AMCR
#> 2 2 AMCR
#> 3 3 BANO
#> 4 4 <NA>
#> 5 10 BAEA
#> 6 11 AMRO
#> 7 277 <NA>
#> 8 2100 BLPH