Pairwise comparison table from a list R

Time:04-06

I have a list of 4 character vectors of terms. I would like a table of pairwise comparisons between the vectors: for each pair, how many terms do the two vectors have in common?

Here is an example:

set.seed(20190708)
genes <- paste("gene",1:1000,sep="")
x <- list(
  A = sample(genes,300), 
  B = sample(genes,525), 
  C = sample(genes,440),
  D = sample(genes,350)
)

And here is what I'm looking for:

[image: desired output table]

The values are the numbers of terms present in both groups of each pair.

CodePudding user response:

We may use outer if we want a symmetric matrix as output, and as.dist to present the result as just the lower triangle.

out <- outer(x, x, FUN = Vectorize(function(u, v) length(intersect(u, v))))
as.dist(out)
#>     A   B   C
#> B 151        
#> C 128 228    
#> D 133 187 150
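
To see what this computes, the same call can be run on a list small enough to check by hand. This is only a sketch: the toy list and the helper name shared are made up for illustration and are not part of the question's data.

```r
# Hypothetical toy data, small enough to verify the counts by eye
toy <- list(
  A = c("g1", "g2", "g3"),
  B = c("g2", "g3", "g4"),
  C = c("g3", "g5")
)

# outer() hands FUN all pairs at once, so a function that works on one
# pair of vectors has to be wrapped with Vectorize() first
shared <- function(u, v) length(intersect(u, v))
toy_out <- outer(toy, toy, FUN = Vectorize(shared))

toy_out["A", "B"]  # 2: g2 and g3
toy_out["A", "C"]  # 1: g3
toy_out["C", "C"]  # 2: the diagonal holds each vector's own size
```

The full matrix is symmetric, which is why as.dist can drop the upper triangle and the diagonal without losing any pairwise count.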

Or, if only the pairwise comparisons are needed, without the mirrored duplicates, use combn:

out <- combn(x, 2, FUN = function(pair) length(intersect(pair[[1]], pair[[2]])))
names(out) <- combn(names(x), 2, FUN = paste, collapse = "_")
stack(out)[2:1]
#>   ind values
#> 1 A_B    151
#> 2 A_C    128
#> 3 A_D    133
#> 4 B_C    228
#> 5 B_D    187
#> 6 C_D    150
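
If separate columns for the two group names are more convenient than a pasted label, a variation along these lines might work. It is a sketch on made-up toy data; the column names first, second and n are arbitrary choices, not something the question asks for.

```r
# Hypothetical toy data, small enough to verify the counts by eye
toy <- list(
  A = c("g1", "g2", "g3"),
  B = c("g2", "g3", "g4"),
  C = c("g3", "g5")
)

# Each column of `pairs` holds the names of one pair of groups
pairs <- combn(names(toy), 2)

# Count the shared terms for every pair (one column at a time)
n <- apply(pairs, 2, function(p) length(intersect(toy[[p[1]]], toy[[p[2]]])))

res <- data.frame(first = pairs[1, ], second = pairs[2, ], n = n)
res
```

Here res has one row per unordered pair: A/B with 2 shared terms, A/C with 1, and B/C with 1.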

CodePudding user response:

Here is another base R option, using crossprod on the term-by-group table:

> crossprod(table(stack(x)))
   ind
ind   A   B   C   D
  A 300 151 128 133
  B 151 525 228 187
  C 128 228 440 150
  D 133 187 150 350

or

> as.dist(crossprod(table(stack(x))))
    A   B   C
B 151
C 128 228
D 133 187 150
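
Why this works: table(stack(x)) is a term-by-group matrix of counts, and crossprod(m) is t(m) %*% m, so each cell sums, over all terms, the product of that term's counts in the two groups. When no term is repeated within a vector (as with sample() without replacement in the question), the counts are 0 or 1 and the sum equals the size of the intersection. The sketch below checks this on made-up toy data and shows, as an assumption about inputs the question does not cover, what happens with repeated terms.

```r
# Hypothetical toy data, small enough to verify the counts by eye
toy <- list(
  A = c("g1", "g2", "g3"),
  B = c("g2", "g3", "g4"),
  C = c("g3", "g5")
)

m  <- table(stack(toy))  # rows: terms, columns: groups, cells: 0 or 1
cp <- crossprod(m)       # same as t(m) %*% m
cp["A", "B"]             # 2: g2 and g3

# With a repeated term the cells of the table exceed 1 and the
# product no longer equals the number of distinct shared terms
dup <- list(A = c("g1", "g1", "g2"), B = c("g1", "g2"))
with_dups <- crossprod(table(stack(dup)))["A", "B"]                 # 3
deduped   <- crossprod(table(stack(lapply(dup, unique))))["A", "B"] # 2
```

So if the vectors might contain duplicates, applying unique to each one first keeps the result in line with the intersect-based answers.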