Perform Fisher test comparing multiple dataframe columns to the same vector in R

Time:07-22

I have a dataframe:

frequencies <- data.frame(row.names = c("a", "b", "c")
                          ,response = c(10, 7, 4)
                          ,no_response = c(12, 12, 7))

> frequencies
  response no_response
a       10          12
b        7          12
c        4           7

I would like to run Fisher's exact test on each row against the column totals for the whole experiment. In other words, I want to know whether the frequencies observed in any of the a/b/c subsets differ from those observed for the whole dataset.

To do it "manually", I count how many observations I have in each column:

library(magrittr)  # provides the %>% pipe

total <- colSums(frequencies) %>% 
  t() %>% 
  as.data.frame() %>% 
  `rownames<-`("total")

> total
      response no_response
total       21          31

I then run fisher.test() (from which I only need the p value), comparing each row to total[1,]:

ap <- fisher.test(rbind(total[1,], frequencies[1,]))$p.value
bp <- fisher.test(rbind(total[1,], frequencies[2,]))$p.value

and so on.

There must be a neater way. In the final output, I would like to have a column in the frequencies dataframe that contains the p values, looking like this:

  response no_response  pval
a       10          12   0.8
b        7          12     1
c        4           7     1

I added a purrr tag, because I feel I should be using map here but I don't know how to go about it.

CodePudding user response:

You can try something simple like this with dplyr:

library(dplyr)

total <- frequencies %>%
  summarise(across(everything(), sum))

frequencies %>%
  rowwise() %>%
  mutate(pval = stats::fisher.test(rbind(total, c(response, no_response)))$p.value) %>%
  ungroup()
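
Since the question mentions purrr, here is a minimal sketch of the same idea with map2_dbl(). It assumes the totals are kept as a plain named vector from colSums() rather than as a one-row data frame, and it assigns the result into the original data frame, so the a/b/c row names stay in place. Treat it as one possible approach, not the only one.

```r
library(purrr)

frequencies <- data.frame(
  row.names   = c("a", "b", "c"),
  response    = c(10, 7, 4),
  no_response = c(12, 12, 7)
)

# Column totals as a named numeric vector (assumption: computed before
# any p-value column is added to the data frame)
total <- colSums(frequencies)

# map2_dbl() walks the two count columns in parallel and returns a
# numeric vector with one p-value per row
frequencies$pval <- map2_dbl(
  frequencies$response,
  frequencies$no_response,
  function(r, n) fisher.test(rbind(total, c(r, n)))$p.value
)

frequencies
```

Each call builds a 2x2 matrix with the totals in the first row and one subset in the second, which is the same table the manual rbind() calls in the question produce.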

CodePudding user response:

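Base R, with total defined as in the question:

Using a for loop:

frequencies$p.value <- NA_real_
for (i in seq_len(nrow(frequencies))) {
  # compare the counts in row i (first two columns) with the totals row
  frequencies$p.value[i] <- fisher.test(rbind(total[1, ], frequencies[i, 1:2]))$p.value
}

Or using apply:

rbindtest <- function(x) {
  fisher.test(rbind(total[1, ], x))$p.value
}
# pass only the two count columns, so an existing p.value column is not included
frequencies$p.value <- apply(frequencies[, 1:2], 1, rbindtest)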
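
A type-checked variant of the base approach can be sketched with vapply(). The helper name count_cols is an assumption introduced here for illustration; selecting the count columns by name means the code can be rerun after the p.value column already exists.

```r
frequencies <- data.frame(
  row.names   = c("a", "b", "c"),
  response    = c(10, 7, 4),
  no_response = c(12, 12, 7)
)

# Hypothetical helper: names of the columns that hold the counts
count_cols <- c("response", "no_response")

# Totals over the count columns only
total <- colSums(frequencies[count_cols])

# vapply() enforces that every iteration returns exactly one number
frequencies$p.value <- vapply(
  seq_len(nrow(frequencies)),
  function(i) {
    # 2x2 table: totals in the first row, counts of row i in the second
    tab <- rbind(total, unlist(frequencies[i, count_cols]))
    fisher.test(tab)$p.value
  },
  numeric(1)
)

frequencies
```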
