My data looks similar to this:
df_out <- data.frame(
"name" = c("1", "2", "3", "4", "5", "6", "7", "8"),
"Factor1"=rep(c("A","B","C"),times= c(2,1,5)),
"col2"=rep(c("A","G"),times= c(4,4)),
"col3"=rep(c("T","S"),times= c(2,6)),
"col4"=rep(c("E","D"),times= c(6,2)),
"col5"=rep(c("N","A","R"),times= c(4,2,2)),
"col6"=rep(c("B","O"),times= c(1,7)),
"col7"=rep(c("N","A","R","L"),times= c(1,3,2,2)),
"col8"=rep(c("I","V","R"),times= c(2,4,2)),
"col9"=rep(c("I","G","R"),times= c(1,6,1)),
"col9"=rep(c("F","L","N"),times= c(5,2,1)),
"col10"=rep(c("T","C","R"),times= c(3,2,3)))
df_out
I want to test whether the occurrence of observations in each column (col2 to col10) differs according to Factor1, using Fisher's exact test. To do that, I have to make a contingency table for every column, like this:
fisher<-with(df_out, table(Factor1, col2))
fisher
rstatix::fisher_test(fisher, detailed = TRUE)
Can you please help me build the contingency table for every column in my data set, run the Fisher tests on all of them at once, and extract only the p-values that are significant? Thank you.
CodePudding user response:
Try looping over the col columns with map, running fisher_test on each one, combining the results into a single tibble (the _dfr suffix, i.e. map_dfr), and using filter to keep only the significant cases:
library(purrr)
library(rstatix)
library(dplyr)
map_dfr(df_out[grep("col", names(df_out))],
  ~ fisher_test(table(df_out$Factor1, .x), detailed = TRUE), .id = 'colnm') %>%
  filter(p.signif != "ns")
-output
# A tibble: 1 × 6
colnm n p method alternative p.signif
<chr> <int> <dbl> <chr> <chr> <chr>
1 col3 8 0.0357 Fisher's Exact test two.sided *
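
If you would rather avoid the extra packages, the same loop can be sketched in base R. This is only a minimal sketch, not a drop-in replacement: it rebuilds a reduced copy of the question's data (col2 to col4 only) so that it runs on its own, and the names test_cols, p_raw, res and sig, the 0.05 cutoff, and the Holm adjustment are my own assumptions rather than anything from the question. Because several tests are run at once, an adjusted p-value is worth looking at next to the raw one.

```r
# Reduced copy of the question's data (only col2 to col4), so the sketch is self-contained
df_out <- data.frame(
  name    = as.character(1:8),
  Factor1 = rep(c("A", "B", "C"), times = c(2, 1, 5)),
  col2    = rep(c("A", "G"), times = c(4, 4)),
  col3    = rep(c("T", "S"), times = c(2, 6)),
  col4    = rep(c("E", "D"), times = c(6, 2))
)

# Columns to test: every column whose name starts with "col"
test_cols <- grep("^col", names(df_out), value = TRUE)

# One contingency table and one Fisher's exact test per column; keep only the p-value
p_raw <- vapply(
  test_cols,
  function(cn) fisher.test(table(df_out$Factor1, df_out[[cn]]))$p.value,
  numeric(1)
)

# Collect raw and Holm-adjusted p-values (the adjustment method is an assumption)
res <- data.frame(
  colnm = test_cols,
  p     = p_raw,
  p_adj = p.adjust(p_raw, method = "holm"),
  row.names = NULL
)

# Keep the columns whose raw p-value is below the assumed 0.05 cutoff
sig <- res[res$p < 0.05, ]
sig
```

To filter on the adjusted values instead, replace res$p with res$p_adj in the last step. With a data set this small, the adjusted p-values may well leave nothing significant.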