I need to find whether there is an intersection between the row and column names of a matrix. Before checking for a match, I need to split each label according to a pattern.
Thanks in advance
#ad hoc function
count_intersect <- function(x,y,symbol){
require(stringr)
x <- strsplit(x = x,split = symbol, perl = T)
y <- strsplit(x = y,split = symbol, perl = T)
result <- ifelse(length(intersect(x,y)),1,0)
return(result)
}
rows <- c("a b","a c","a")
cols <- c("a","a c","d")
#my attempt
outer(rows, cols, count_intersect(""))
#toy example
expected = matrix(data = c(1,1,1,1,1,1,0,0,0),nrow = 3, ncol = 3)
rownames(expected) <- c("a b","a c","a")
colnames(expected) <- c("a","a c","d")
    a a c d
a b 1   1 0
a c 1   1 0
a   1   1 0
CodePudding user response:
count_intersect <- function(x,y,symbol){
x <- strsplit(x = x,split = symbol, perl = T)[[1]]
y <- strsplit(x = y,split = symbol, perl = T)[[1]]
result <- ifelse(length(intersect(x,y)),1,0)
return(result)
}
res <- outer(rows, cols, Vectorize(count_intersect), symbol = "")
rownames(res) <- rows
colnames(res) <- cols
res
#>     a a c d
#> a b 1   1 0
#> a c 1   1 0
#> a   1   1 0
outer does not call the function once per pair of elements. It expands x and y into two equally long vectors that cover every combination (with y laid out as if transposed) and passes them to the function as a whole in a single call, so the function has to be vectorized. You can work around this issue by wrapping it in Vectorize. A character value is always a vector in R, which means that strsplit splits every element of its input (of which you only have one here) and returns a list. You thus want to use the first element only, hence the [[1]].
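To see the difference for yourself, here is a minimal sketch. The probe function and the call counters are hypothetical helpers made up for illustration, not part of the question; they only record how many times outer invokes the function with and without Vectorize.

```r
rows <- c("a b", "a c", "a")
cols <- c("a", "a c", "d")

# Hypothetical helper: counts its own invocations and pastes its inputs
calls <- 0
probe <- function(x, y) {
  calls <<- calls + 1
  paste(x, y, sep = "|")
}

# Plain outer(): one call that receives two length-9 vectors
m_plain <- outer(rows, cols, probe)
calls_plain <- calls

# Vectorize(): the wrapper calls probe once per (row, col) pair
calls <- 0
m_vec <- outer(rows, cols, Vectorize(probe))
calls_vec <- calls

calls_plain   # 1
calls_vec     # 9
m_plain[1, 2] # "a b|a c"
```

Because paste is already vectorized, both versions return the same 3 x 3 matrix. count_intersect is not, since it only looks at the first element after [[1]], which is why it needs the Vectorize wrapper.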
Otherwise, I think your attempt is pretty solid.
(One can also remove the require(stringr) part, as strsplit is implemented in base R.)
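One caveat: with symbol = "", strsplit breaks each label into single characters, so the space itself counts as a shared token and two labels such as "x y" and "z w" would be reported as intersecting. If the labels are meant to be space-separated tokens (an assumption on my part, the question does not say), a possible alternative is to split once up front and compare the token lists. The names row_tokens and col_tokens below are just illustrative.

```r
rows <- c("a b", "a c", "a")
cols <- c("a", "a c", "d")

# Split every label once, on a literal space (assumed separator)
row_tokens <- strsplit(rows, " ", fixed = TRUE)
col_tokens <- strsplit(cols, " ", fixed = TRUE)

# For each column label, check every row label for at least one shared token;
# the outer sapply binds the per-column results into a matrix
res <- sapply(col_tokens, function(ct) {
  sapply(row_tokens, function(rt) as.integer(length(intersect(rt, ct)) > 0))
})
dimnames(res) <- list(rows, cols)
res
```

For the toy data this gives the same matrix as the expected one in the question, and it avoids calling strsplit again for every pair.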