Home > Net >  Replace NAs based on another two data frames across multiple columns
Replace NAs based on another two data frames across multiple columns

Time:01-31

I would like to replace NAs of df1 by df2(both of two data frames are large with about 155 columns x 3966 rows) Both of two data frames are with the first column 'ID' and other column names are overlapped. Here is a part of example:

> df1
       ID col1 col2  ...... col154
1  AMM115    C    A  ......     A 
2  ADM107   NA   NA  ......      B
3  AGM041    B    C  ......     C 
4  AGM132    A   NA  ......     A 
5  AQM007   NA    A  ......     B 
6  ARM028   NA   B   ......     A-
7  ASM019    A   A   ......     NA

> df2
       ID col1 col2  ...... col154
1  ADM107   A     B  ......      B
2  AGM041    C    A  ......     B 
3  ARM028   A    B   ......     NA
4  AQM007   B     A  ......     B 

I try this answers here : Can I replace NAs when joining two data frames with dplyr? I know coalesce might be I needed. It can fills the NA from the first vector but how could I across all columns. but I can't figure out how to mutate and coalesce multiple columns like my real data expect first column. Also select suffix .x or .y at this code.

library(dplyr)
df1 %>% 
        left_join(df2, by = "fruit") %>% 
        mutate(var2 = coalesce(var2.x, var2.y)) %>% 
        select(-var2.x, -var2.y)

I want get like df3, if NAs of df1 and df2 have value , using df2 to replace NAs of df1,if not, keep df1 original values.

> df3
       ID col1 col2  ...... col154
1  AMM115    C    A  ......     A 
2  ADM107   A     B  ......      B
3  AGM041    B    C  ......     C 
4  AGM132    A   NA  ......     A 
5  AQM007   B     A  ......     B 
6  ARM028   A    B   ......     A-
7  ASM019    A   A   ......     NA

CodePudding user response:

You can do:

library(dplyr)
df3 <- df1 %>%
  rows_patch(df2, by = 'ID')
  •  Tags:  
  • r
  • Related