I would like to run fisher.test() for each tax and each column (ABCB1 and ABL1 in the example below) of the data frame shown further down.
The contingency tables should be built from the two rows of each tax, as in the example below. Note that the second column of the contingency table needs to be calculated by subtracting the tested column from the Total column.
contingency example:
                ABCB1  NotABCB1 (Total - ABCB1)
tax1Present         1        42
tax1NotPresent      3        30
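For a single table, the test could be run roughly like this. It is only a minimal sketch for tax1 and ABCB1: the names `abcb1`, `total` and `tab` are illustrative, and the counts are typed in by hand from the tax1 rows of the data.

```r
# Counts for the tax1 rows, copied by hand from the example data
abcb1 <- c(tax1Present = 1, tax1NotPresent = 3)
total <- c(tax1Present = 43, tax1NotPresent = 33)

# Second column of the table is Total minus the tested column
tab <- cbind(ABCB1 = abcb1, NotABCB1 = total - abcb1)
tab

# Fisher's exact test on the resulting 2x2 table
res <- fisher.test(tab)
res$p.value
```

The goal is to repeat this for every tax and every tested column without building each table by hand.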
data:
df <- structure(list(group = c("tax1Present", "tax1NotPresent", "tax2Present",
"tax2NotPresent", "tax3Present", "tax3NotPresent", "tax4Present",
"tax4NotPresent", "tax5Present", "tax5NotPresent"), ABCB1 = c(1L,
3L, 4L, 5L, 3L, 6L, 6L, 12L, 13L, 6L), ABL1 = c(24L, 24L, 12L,
53L, 1L, 5L, 0L, 0L, 242L, 0L), Total = c(43L, 33L, 23L, 70L,
9L, 15L, 7L, 19L, 300L, 10L), tax = c("tax1", "tax1", "tax2",
"tax2", "tax3", "tax3", "tax4", "tax4", "tax5", "tax5")), row.names = c(NA,
10L), class = "data.frame")
> df
            group ABCB1 ABL1 Total  tax
1     tax1Present     1   24    43 tax1
2  tax1NotPresent     3   24    33 tax1
3     tax2Present     4   12    23 tax2
4  tax2NotPresent     5   53    70 tax2
5     tax3Present     3    1     9 tax3
6  tax3NotPresent     6    5    15 tax3
7     tax4Present     6    0     7 tax4
8  tax4NotPresent    12    0    19 tax4
9     tax5Present    13  242   300 tax5
10 tax5NotPresent     6    0    10 tax5
CodePudding user response:
Try using sapply() and lapply():
# set the columns to use
columns <- c("ABCB1", "ABL1")
# outer loop over the column names, inner loop over the taxa;
# looping over the names keeps the results in the same order as `columns`
dat_test <- sapply( columns,
  function(colx) lapply( unique( df$tax ), function(x)
    # 2x2 table: tested column next to Total minus the tested column
    fisher.test( data.frame(df[ df$tax %in% x,colx],
      Total_diff=df[ df$tax %in% x, ]$Total - df[ df$tax %in% x, ][colx] )
  ) ) )
# set names
rownames(dat_test) <- unique( df$tax )
colnames(dat_test) <- columns
dat_test
     ABCB1  ABL1
tax1 List,7 List,7
tax2 List,7 List,7
tax3 List,7 List,7
tax4 List,7 List,7
tax5 List,7 List,7
Access with e.g.:
dat_test[,"ABCB1"]
$tax1

	Fisher's Exact Test for Count Data

data:  data.frame(df[df$tax %in% x, colx], Total_diff = df[df$tax %in% x, ]$Total - df[df$tax %in% x, ][colx])
p-value = 0.3109
alternative hypothesis: true odds ratio is not equal to 1
95 percent confidence interval:
 0.00443791 3.18701284
sample estimates:
odds ratio
 0.2424665

...etc
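If only the p-values are needed, they could be collected into a plain numeric matrix along these lines. This is a sketch and not part of the code above: `fisher_p`, `taxa` and `p_values` are hypothetical names, and the data frame is retyped from the question so that the block runs on its own.

```r
# Example data retyped from the question (group column omitted)
df <- data.frame(
  ABCB1 = c(1L, 3L, 4L, 5L, 3L, 6L, 6L, 12L, 13L, 6L),
  ABL1  = c(24L, 24L, 12L, 53L, 1L, 5L, 0L, 0L, 242L, 0L),
  Total = c(43L, 33L, 23L, 70L, 9L, 15L, 7L, 19L, 300L, 10L),
  tax   = rep(paste0("tax", 1:5), each = 2)
)
columns <- c("ABCB1", "ABL1")
taxa <- unique(df$tax)

# Hypothetical helper: p-value for one tax and one tested column
fisher_p <- function(tax_name, col) {
  rows <- df[df$tax == tax_name, ]
  # 2x2 table: tested column next to Total minus the tested column
  tab <- cbind(rows[[col]], rows$Total - rows[[col]])
  fisher.test(tab)$p.value
}

# One row per tax, one column per tested column
p_values <- sapply(columns, function(col) {
  vapply(taxa, fisher_p, numeric(1), col = col)
})
p_values
```

A numeric matrix like this is easier to filter or sort than the matrix of test objects, at the cost of dropping the odds ratios and confidence intervals.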