Home > Blockchain >  How to merge two data frmaes with no common column names using R
How to merge two data frmaes with no common column names using R

Time:09-17

I have two data frames as below:

df1:

col1 <- c("Snhg6", "Mt1")
col2 <- c("Lpar6", "Nufip2")
col3 <- c("Lasp1", "Pipox")
col4 <- c("Bcam", "Rhot2")
df1 <- data.frame(col1, col2, col3, col4)

#    col1   col2  col3  col4
# 1 Snhg6  Lpar6 Lasp1  Bcam
# 2   Mt1 Nufip2 Pipox Rhot2

df2:

col10 <- c("Bcam", "Rhot2", "Lpar6" , "Snhg6")
df2 <- data.frame(col10)

#   col10
# 1  Bcam
# 2 Rhot2
# 3 Lpar6
# 4 Snhg6

I want to merge them in a way to find how many of the items in each column of df1 is matching with the items of the df2. I would like to have an output as:

output_df:

#  col1  col2 col3  col4
# Snhg6 Lpar6   NA  Bcam
#    NA    NA   NA Rhot2

I tried to use the methods below but non of them gave the expected results.

library(dplyr)
output_df <- df1 %>% full_join(df2)

and

library(tidyr)
output_df <- crossing(df1,df2)

How can I get the desired output? Thanks.

CodePudding user response:

  • base
res <- df1
res[array(!unlist(res) %in% df2$col10, dim(res))] <- NA
res

#    col1  col2 col3  col4
# 1 Snhg6 Lpar6 <NA>  Bcam
# 2  <NA>  <NA> <NA> Rhot2
  • dplyr
library(dplyr)

df1 %>%
  mutate(across(everything(), ~ replace(.x, !.x %in% df2$col10, NA)))

#    col1  col2 col3  col4
# 1 Snhg6 Lpar6 <NA>  Bcam
# 2  <NA>  <NA> <NA> Rhot2

CodePudding user response:

df1_mat <- as.matrix(df1)
df1_mat[!df1_mat %in% df2$col10]  <- NA
as.data.frame(df1_mat)
#>    col1  col2 col3  col4
#> 1 Snhg6 Lpar6 <NA>  Bcam
#> 2  <NA>  <NA> <NA> Rhot2

CodePudding user response:

Another option is to use an ifelse statement with str_detect wrap in across:

library(dplyr)
library(stringr)

pattern<- paste(df2$col10, collapse = "|")

df1 %>% 
  mutate(across(everything(), ~ifelse(str_detect(., pattern), ., NA_character_)))       
   col1  col2 col3  col4
1 Snhg6 Lpar6 <NA>  Bcam
2  <NA>  <NA> <NA> Rhot2
  • Related