compare multiple columns in a data frame with vector and add new column to the data frame indicating

Time:08-25

my_vector <- c("g1","g2","g3","g4")

Read_1 Read_2 Read_3 Read_4
g1     o      o      g1
o      g2     o      g2
g3     g3     g3     o

Read_1 to Read_4 are the column names of my data frame, and I have a vector called my_vector. I want to compare the vector against every column of the data frame and, for each row, report the names of the columns where a match is found. I tried this with two columns, but I don't know how to compare against multiple columns.

This is my expected output

Read_1 Read_2 Read_3 Read_4 Match_column
g1     o      o      g1     Read_1,Read_4
o      g2     o      g2     Read_2,Read_4
g3     g3     g3     o      Read_1,Read_2,Read_3

Can someone help me with this?

CodePudding user response:

We could loop over the rows (apply with MARGIN = 1), subset the names of each row based on the elements of 'my_vector', and paste the matching names into a single string with toString

df1$Match_column <- apply(df1, 1, function(x)
       toString(names(x)[x %in% my_vector]))

-output

 df1
  Read_1 Read_2 Read_3 Read_4           Match_column
1     g1      o      o     g1         Read_1, Read_4
2      o     g2      o     g2         Read_2, Read_4
3     g3     g3     g3      o Read_1, Read_2, Read_3
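
To see what the anonymous function does, it may help to step through a single row by hand. The sketch below rebuilds the question's data and takes the first row apart. The names `x`, `hits` and `matched` are only illustrative, and `unlist(df1[1, ])` is used here as a stand-in for the named character vector that apply passes to the function for each row.

```r
# Same lookup vector and data as in the question
my_vector <- c("g1", "g2", "g3", "g4")
df1 <- data.frame(
  Read_1 = c("g1", "o", "g3"),
  Read_2 = c("o", "g2", "g3"),
  Read_3 = c("o", "o", "g3"),
  Read_4 = c("g1", "g2", "o"),
  stringsAsFactors = FALSE
)

# First row as a named character vector (names are the column names)
x <- unlist(df1[1, ])

# Logical vector: TRUE where the cell value occurs in my_vector
hits <- x %in% my_vector

# Keep only the column names at the TRUE positions, then collapse them
matched <- toString(names(x)[hits])
matched
```

For the first row this gives "Read_1, Read_4", the same value as in the output above. toString joins with a comma followed by a space.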

data

df1 <- structure(list(Read_1 = c("g1", "o", "g3"), Read_2 = c("o", "g2", 
"g3"), Read_3 = c("o", "o", "g3"), Read_4 = c("g1", "g2", "o"
)), class = "data.frame", row.names = c(NA, -3L))
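
If the separator has to match the expected output in the question exactly (a comma without a space), one possible variation is to swap toString for paste with collapse = ",". The sketch below also restricts the comparison to the Read_ columns, on the assumption that the data frame may hold other columns (for example, a Match_column left over from an earlier run) that should not be compared. The helper name `read_cols` is made up for illustration.

```r
my_vector <- c("g1", "g2", "g3", "g4")
df1 <- data.frame(
  Read_1 = c("g1", "o", "g3"),
  Read_2 = c("o", "g2", "g3"),
  Read_3 = c("o", "o", "g3"),
  Read_4 = c("g1", "g2", "o"),
  stringsAsFactors = FALSE
)

# Assumption: only columns whose names start with "Read_" should be compared
read_cols <- grep("^Read_", names(df1), value = TRUE)

# Row-wise: keep the names of matching columns, joined by "," (no space)
df1$Match_column <- apply(df1[read_cols], 1, function(x)
  paste(names(x)[x %in% my_vector], collapse = ","))

df1
```

A row with no match at all ends up with an empty string in Match_column, because paste on a zero-length vector with collapse set returns "".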
Tags: r