Contingency table and Fisher exact tests for multiple columns

My data looks similar to this:

df_out <- data.frame(
    "name" = c("1", "2", "3", "4", "5", "6", "7", "8"),
    "Factor1"=rep(c("A","B","C"),times= c(2,1,5)),
    "col2"=rep(c("A","G"),times= c(4,4)),
    "col3"=rep(c("T","S"),times= c(2,6)),
    "col4"=rep(c("E","D"),times= c(6,2)),
    "col5"=rep(c("N","A","R"),times= c(4,2,2)),
    "col6"=rep(c("B","O"),times= c(1,7)),
    "col7"=rep(c("N","A","R","L"),times= c(1,3,2,2)),
    "col8"=rep(c("I","V","R"),times= c(2,4,2)),
    "col9"=rep(c("I","G","R"),times= c(1,6,1)),
    "col9"=rep(c("F","L","N"),times= c(5,2,1)),
    "col10"=rep(c("T","C","R"),times= c(3,2,3)))
  df_out

I want to test whether the distribution of values in each column (col2 to col10) differs by Factor1, using Fisher's exact test. So I have to build a contingency table for every column, like this:

fisher <- with(df_out, table(Factor1, col2))
fisher
rstatix::fisher_test(fisher, detailed = TRUE)

Can you please help me build the contingency table for every column in my data set, run all the Fisher tests at once, and extract only the p-values that are significant? Thank you.

CodePudding user response:

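Loop over the col columns with map_dfr: for each column, build its contingency table against Factor1 and run fisher_test on it. The _dfr suffix row-binds the per-column results into a single tibble, and .id records which column each row came from. Then filter to keep only the significant cases.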

library(purrr)
library(rstatix)
library(dplyr)
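# Pick every column whose name contains "col", test each one against Factor1,
# and row-bind the results; .id = "colnm" stores the source column name.
map_dfr(df_out[grep("col", names(df_out))],
        ~ fisher_test(table(df_out$Factor1, .x), detailed = TRUE),
        .id = "colnm") %>%
  filter(p.signif != "ns")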

-output

# A tibble: 1 × 6
  colnm     n      p method              alternative p.signif
  <chr> <int>  <dbl> <chr>               <chr>       <chr>   
1 col3      8 0.0357 Fisher's Exact test two.sided   *
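
If you would rather avoid the extra packages, the same loop can be sketched in base R with stats::fisher.test. This is only a rough equivalent, not what rstatix does internally: the data frame is trimmed to three of the question's columns for brevity, the names test_cols, p_values and significant are made up for illustration, and the 0.05 cutoff is an assumption standing in for the p.signif != "ns" filter.

```r
# Same layout as the question's data, trimmed to three columns for brevity.
df_out <- data.frame(
  Factor1 = rep(c("A", "B", "C"), times = c(2, 1, 5)),
  col2    = rep(c("A", "G"), times = c(4, 4)),
  col3    = rep(c("T", "S"), times = c(2, 6)),
  col4    = rep(c("E", "D"), times = c(6, 2))
)

# Columns to test: every column whose name starts with "col".
test_cols <- grep("^col", names(df_out), value = TRUE)

# One Fisher exact test per column against Factor1; keep only the p-value.
# vapply names the result after the column names it iterated over.
p_values <- vapply(
  test_cols,
  function(nm) fisher.test(table(df_out$Factor1, df_out[[nm]]))$p.value,
  numeric(1)
)

# Assumed significance threshold of 0.05 (hypothetical choice, adjust as needed).
significant <- p_values[p_values < 0.05]
print(significant)
```

On this trimmed data only col3 should survive the cutoff, in line with the tibble above. The result is a plain named numeric vector rather than a tibble, so wrap it in data.frame(colnm = names(significant), p = significant) if you need a table.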