I have 6 matrices and I have to perform a t-test over cell values combined from the 6 matrices and s

Time:06-01

Input: There are 6 input matrices of the same dimensions. 3 input matrices from normal tissues:

              GeneA             GeneB          
GeneA          31                  4           
GeneB           5                  8 

              GeneA             GeneB          
GeneA           5                 14           
GeneB           5                  8 


              GeneA             GeneB          
GeneA          30                 14           
GeneB           45                 7 

3 input matrices from cancer tissues:

              GeneA             GeneB          
GeneA          11                  4           
GeneB           5                  18 

              GeneA             GeneB          
GeneA           7                 14           
GeneB           15                 4 


              GeneA             GeneB          
GeneA          30                 14           
GeneB           45                 7 

output:

                  GeneA                            GeneB          
GeneA        t-test({31,5,30},{11,7,30})    t-test({4,14,14},{4,14,14})
GeneB        t-test({5,5,45},{5,15,45})     t-test({8,8,7},{18,4,7})

Output matrix will have the p-values from the test

CodePudding user response:

The code below runs the t-tests on the data stored as two 3-dimensional arrays, one per tissue type. This makes it easier to loop over the cells, extract the required vectors and run the tests.

From tabular data to arrays

normal <- mget(ls(pattern = "^normal"))
cancer <- mget(ls(pattern = "^cancer"))

anorm <- array(dim = c(dim(normal[[1]]), length(normal)))
acanc <- array(dim = c(dim(cancer[[1]]), length(cancer)))
for(i in seq_along(normal)) {
  anorm[, , i] <- unlist(normal[[i]])
  acanc[, , i] <- unlist(cancer[[i]])
}
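To make the array layout concrete, here is a minimal sketch that stacks three 2 x 2 matrices along a third dimension and pulls out the vector for one cell. The names `normal_list`, `stacked`, `cell_11` and `cell_21` are hypothetical, and building the matrices directly with `matrix()` is an assumption made to keep the sketch self-contained (the answer reads them with `read.table`). The values are the normal-tissue values from the question.

```r
# Hypothetical stand-ins for the three normal-tissue tables (values from the question).
# matrix() fills column by column, so c(31, 5, 4, 8) gives rows (31, 4) and (5, 8).
normal_list <- list(
  matrix(c(31, 5, 4, 8), nrow = 2),
  matrix(c(5, 5, 14, 8), nrow = 2),
  matrix(c(30, 45, 14, 7), nrow = 2)
)

# Stack the matrices along a third dimension: rows x columns x samples.
stacked <- array(unlist(normal_list),
                 dim = c(dim(normal_list[[1]]), length(normal_list)))

dim(stacked)

# Indexing with an empty third subscript returns one value per sample.
cell_11 <- stacked[1, 1, ]   # GeneA/GeneA across the three matrices
cell_21 <- stacked[2, 1, ]   # GeneB/GeneA across the three matrices
```

Each such vector is what gets passed to `t.test()` as one of the two groups.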

The t-tests

Create a results list first, then run the tests, then extract the p-values into a data.frame.

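# One list slot per cell, filled in column-major order
t_test_list <- vector("list", length = prod(dim(anorm)[1:2]))
for(j in seq_len(dim(anorm)[2])) {
  for(i in seq_len(dim(anorm)[1])) {
    x <- anorm[i, j, ]   # normal-tissue values for cell (i, j)
    y <- acanc[i, j, ]   # cancer-tissue values for cell (i, j)
    k <- i + dim(anorm)[1]*(j - 1)   # linear index of cell (i, j)
    t_test_list[[k]] <- t.test(x, y)
  }
}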

t_test_pval <- normal[[1]]
t_test_pval[] <- sapply(t_test_list, `[[`, 'p.value')

t_test_pval
#>           GeneA     GeneB
#> GeneA 0.6176386 1.0000000
#> GeneB 0.8618120 0.6850214

Created on 2022-06-01 by the reprex package (v2.0.1)
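As a possible alternative to the explicit double loop, the same per-cell tests can be sketched with `apply()` over the first two dimensions of a single combined array. This is only one way to do it, not the approach used in the answer. The names `normal_vals`, `cancer_vals`, `both` and `p_apply` are hypothetical, and the values are typed in directly (column by column) from the question's tables so the sketch runs on its own.

```r
# Values from the question, each matrix filled column by column (hypothetical names).
normal_vals <- c(31, 5, 4, 8,   5, 5, 14, 8,   30, 45, 14, 7)
cancer_vals <- c(11, 5, 4, 18,  7, 15, 14, 4,  30, 45, 14, 7)

# One 2 x 2 x 6 array: slices 1-3 are normal tissue, slices 4-6 are cancer tissue.
both <- array(c(normal_vals, cancer_vals), dim = c(2, 2, 6))

# For every (row, column) cell, split the six values into the two groups
# and keep only the p-value of the (default, Welch) two-sample t-test.
p_apply <- apply(both, c(1, 2), function(v) t.test(v[1:3], v[4:6])$p.value)
dimnames(p_apply) <- list(c("GeneA", "GeneB"), c("GeneA", "GeneB"))

p_apply
```

Because `apply()` with margins `c(1, 2)` returns a matrix of the same shape as a single input matrix, no linear index such as `k` is needed.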


Data

x<-'              GeneA             GeneB          
GeneA          31                  4           
GeneB           5                  8'
normal1 <- read.table(textConnection(x), header = TRUE)

x<-'              GeneA             GeneB          
GeneA           5                 14           
GeneB           5                  8 '
normal2 <- read.table(textConnection(x), header = TRUE)


x<-'              GeneA             GeneB          
GeneA          30                 14           
GeneB           45                 7 '
normal3 <- read.table(textConnection(x), header = TRUE)

x<-'              GeneA             GeneB          
GeneA          11                  4           
GeneB           5                  18 '
cancer1 <- read.table(textConnection(x), header = TRUE)

x<-'              GeneA             GeneB          
GeneA           7                 14           
GeneB           15                 4 '
cancer2 <- read.table(textConnection(x), header = TRUE)


x<-'              GeneA             GeneB          
GeneA          30                 14           
GeneB           45                 7 '
cancer3 <- read.table(textConnection(x), header = TRUE)


CodePudding user response:

Perhaps the neatest way to do this is to bind each set of matrices into a 2 x 2 x 3 array, then use Map to run the t-test at each of the four row/column combinations.

Let's say your matrices are called norm1, norm2, norm3 for the normal tissue and ca1, ca2 and ca3 for the cancer tissue. Then we can do:

library(abind)

norm <- abind::abind(norm1, norm2, norm3, along = 3)
canc <- abind::abind(ca1, ca2, ca3, along = 3)

pvals <- Map(function(i, j) {
  t.test(norm[i, j, ], canc[i, j, ])$p.value
  }, i = c(1:2, 1:2), j = c(1, 1, 2, 2))

matrix(unlist(pvals), 2, dimnames = dimnames(norm1))
#>           GeneA     GeneB
#> GeneA 0.6176386 1.0000000
#> GeneB 0.8618120 0.6850214

Created on 2022-06-01 by the reprex package (v2.0.1)
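The index vectors `i = c(1:2, 1:2)` and `j = c(1, 1, 2, 2)` above are written out by hand for the 2 x 2 case. For larger matrices, one option is to generate the pairs with `expand.grid()`. The following is a minimal sketch under that assumption, not part of the original answer. It rebuilds the arrays in base R from the same values so that it runs without abind, and `genes`, `idx`, `p_vec` and `p_mat` are hypothetical names.

```r
# Same values as the reproducible data, stacked directly into 2 x 2 x 3 arrays.
norm <- array(c(31, 5, 4, 8,  5, 5, 14, 8,  30, 45, 14, 7), dim = c(2, 2, 3))
canc <- array(c(11, 5, 4, 18, 7, 15, 14, 4, 30, 45, 14, 7), dim = c(2, 2, 3))
genes <- c("GeneA", "GeneB")

# All (i, j) pairs. expand.grid varies i fastest, which matches the
# column-major order that matrix() uses when filling its result.
idx <- expand.grid(i = seq_len(dim(norm)[1]), j = seq_len(dim(norm)[2]))

# One p-value per (i, j) pair.
p_vec <- mapply(function(i, j) t.test(norm[i, j, ], canc[i, j, ])$p.value,
                idx$i, idx$j)

p_mat <- matrix(p_vec, nrow = dim(norm)[1], dimnames = list(genes, genes))
p_mat
```

Only the array dimensions drive the loop here, so the same code should work unchanged for matrices with more genes.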


Reproducible data

norm1 <- structure(c(31L, 5L, 4L, 8L), dim = c(2L, 2L), dimnames = list(
    c("GeneA", "GeneB"), c("GeneA", "GeneB")))

norm2 <- structure(c(5L, 5L, 14L, 8L), dim = c(2L, 2L), dimnames = list(
    c("GeneA", "GeneB"), c("GeneA", "GeneB")))

norm3 <- structure(c(30L, 45L, 14L, 7L), dim = c(2L, 2L), dimnames = list(
    c("GeneA", "GeneB"), c("GeneA", "GeneB")))

ca1 <- structure(c(11L, 5L, 4L, 18L), dim = c(2L, 2L), dimnames = list(
    c("GeneA", "GeneB"), c("GeneA", "GeneB")))

ca2 <- structure(c(7L, 15L, 14L, 4L), dim = c(2L, 2L), dimnames = list(
    c("GeneA", "GeneB"), c("GeneA", "GeneB")))

ca3 <- structure(c(30L, 45L, 14L, 7L), dim = c(2L, 2L), dimnames = list(
    c("GeneA", "GeneB"), c("GeneA", "GeneB")))
