Manipulate matrix row/col names to compute values

Time:06-12

I need to find out whether the row names and column names of a matrix intersect. Before checking for a match, I need to split each label according to a pattern.

Thanks in advance

#ad hoc function
count_intersect <- function(x,y,symbol){
  require(stringr)
  x <- strsplit(x = x,split = symbol, perl = T)
  y <- strsplit(x = y,split = symbol, perl = T)
  result <- ifelse(length(intersect(x,y)),1,0)
  return(result)
}

rows <- c("a b","a c","a")
cols <- c("a","a c","d")

#my attempt
outer(rows, cols, count_intersect(""))


#toy example      
expected = matrix(data = c(1,1,1,1,1,1,0,0,0),nrow = 3, ncol = 3)
rownames(expected) <- c("a b","a c","a")
colnames(expected) <- c("a","a c","d")

    a a c d
a b 1   1 0
a c 1   1 0
a   1   1 0


CodePudding user response:

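count_intersect <- function(x, y, symbol) {
  # strsplit() returns a list; [[1]] extracts the tokens of the single input string
  x <- strsplit(x = x, split = symbol, perl = TRUE)[[1]]
  y <- strsplit(x = y, split = symbol, perl = TRUE)[[1]]
  # 1 if the two labels share at least one token, 0 otherwise
  result <- ifelse(length(intersect(x, y)), 1, 0)
  return(result)
}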

res <- outer(rows, cols, Vectorize(count_intersect), symbol = "")
rownames(res) <- rows
colnames(res) <- cols

res
#>     a a c d
#> a b 1   1 0
#> a c 1   1 0
#> a   1   1 0

  1. outer does not call the function once per pair of elements. It expands x and y into two vectors of equal length (one entry per row/column combination) and calls the function a single time on those vectors, expecting a vector of results back. count_intersect only handles one pair of strings, so you can work around this by wrapping it in Vectorize.
  2. A character value in R is always a vector, so strsplit splits every element of its input (here there is only one) and returns a list with one entry per element. You therefore want only the first list element, hence the [[1]].
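
Both points can be checked directly. The sketch below uses a throwaway function (`probe`, a name made up for this illustration) that records the arguments it receives from outer, and then inspects what strsplit returns for a single string.

```r
rows <- c("a b", "a c", "a")
cols <- c("a", "a c", "d")

calls <- list()  # every (x, y) pair of arguments that outer() passes to FUN
probe <- function(x, y) {
  calls[[length(calls) + 1]] <<- list(x = x, y = y)
  rep(0, length(x))  # outer() needs one result per element of x
}
invisible(outer(rows, cols, probe))

length(calls)   # 1: FUN is called a single time
calls[[1]]$x    # rows repeated as a whole three times (length 9)
calls[[1]]$y    # each element of cols repeated three times (length 9)

# strsplit() returns a list even for a single input string
pieces <- strsplit("a b", split = "", perl = TRUE)
is.list(pieces)  # TRUE
pieces[[1]]      # "a" " " "b"
```

Because the function sees all nine combinations at once, a function written for one pair of strings has to be wrapped (for example with Vectorize) before outer can use it.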

Otherwise, I think your attempt is pretty solid.

(You can also remove the require(stringr) call, since strsplit is part of base R.)
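
For reference, here is a minimal base-R-only sketch with no stringr dependency. The helper name `share_token` is hypothetical, and the comparison with `symbol = " "` rests on an assumption about the intent: with `symbol = ""` the labels are split into single characters, so the space itself counts as a shared token. For the toy data both choices give the same matrix, but they can differ on other labels.

```r
# Hypothetical helper: 1 if two labels share at least one token, else 0.
share_token <- function(x, y, symbol) {
  x_tokens <- strsplit(x, split = symbol, perl = TRUE)[[1]]
  y_tokens <- strsplit(y, split = symbol, perl = TRUE)[[1]]
  as.integer(length(intersect(x_tokens, y_tokens)) > 0)
}

rows <- c("a b", "a c", "a")
cols <- c("a", "a c", "d")

# symbol = "" splits into characters; symbol = " " splits on spaces
by_char  <- outer(rows, cols, Vectorize(share_token), symbol = "")
by_space <- outer(rows, cols, Vectorize(share_token), symbol = " ")
dimnames(by_char) <- dimnames(by_space) <- list(rows, cols)

by_char
by_space

# Labels outside the toy data where the two choices disagree:
share_token("a b", "c d", symbol = "")   # 1: both contain a space character
share_token("a b", "c d", symbol = " ")  # 0: no token in common
```

Which separator is right depends on what the labels mean; if the letters are the tokens of interest, splitting on the space is probably the safer choice.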
