I have a dataframe:
frequencies <- data.frame(row.names = c("a", "b", "c")
,response = c(10, 7, 4)
,no_response = c(12, 12, 7))
> frequencies
response no_response
a 10 12
b 7 12
c 4 7
I would like to perform Fisher's exact test, comparing each row to the sum of observations from this experiment, i.e. to the frequencies observed for the whole experiment. I want to know whether the frequencies observed in any of the a/b/c data subsets differ from those observed for the whole dataset.
To do it "manually", I count how many observations I have in each column:
library(magrittr)  # provides %>%
total <- colSums(frequencies) %>%
  t() %>%
  as.data.frame() %>%
  `rownames<-`("total")
> total
response no_response
total 21 31
I then run fisher.test() (from which I only need the p-value), comparing each row to total[1,]:
ap <- fisher.test(rbind(total[1,], frequencies[1,]))$p.value
bp <- fisher.test(rbind(total[1,], frequencies[2,]))$p.value
and so on.
There must be a neater way. In the final output, I would like to have a column in the frequencies dataframe that contains the p-values, looking like this:
response no_response pval
a 10 12 0.8
b 7 12 1
c 4 7 1
I added a purrr tag because I feel I should be using map here, but I don't know how to go about it.
CodePudding user response:
You can try something simple like this with dplyr:
library(dplyr)
total <- frequencies %>%
summarise(across(everything(), sum))
frequencies %>%
rowwise() %>%
mutate(pval = stats::fisher.test(rbind(total, c(response, no_response)))$p.value) %>%
ungroup()
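Since the question asks about purrr, here is a minimal sketch of the same idea with map2_dbl(). It assumes the frequencies data frame from the question and keeps total as a plain named vector from colSums(). It also assigns the result back into the original data frame, so the a/b/c row names stay in place (as far as I can tell, the rowwise() pipeline above returns a tibble, which does not carry row names).

```r
library(purrr)

frequencies <- data.frame(
  row.names   = c("a", "b", "c"),
  response    = c(10, 7, 4),
  no_response = c(12, 12, 7)
)

# Column totals as a named numeric vector (computed before pval is added)
total <- colSums(frequencies)

# For each row, build a 2x2 table (totals vs. that row) and keep the p-value
frequencies$pval <- map2_dbl(
  frequencies$response,
  frequencies$no_response,
  ~ fisher.test(rbind(total, c(.x, .y)))$p.value
)

frequencies
```

map2_dbl() walks over the two count columns in parallel and always returns a numeric vector, so the result can go straight into a new column.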
CodePudding user response:
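In base R, using a for loop (this reuses the total data frame from the question):
frequencies$p.value <- NA_real_
for (i in seq_len(nrow(frequencies))) {
  # compare row i (count columns only) against the totals
  frequencies$p.value[i] <- fisher.test(rbind(total[1, ], frequencies[i, 1:2]))$p.value
}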
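Or using apply():
rbindtest <- function(x) {
  fisher.test(rbind(total[1, ], x))$p.value
}
# pass only the count columns, so an existing p.value column is not fed into the test
frequencies$p.value <- apply(frequencies[, c("response", "no_response")], 1, rbindtest)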
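For completeness, a self-contained base R sketch that does not depend on a total object defined elsewhere. It assumes the frequencies data frame from the question. The counts matrix is a helper introduced here for illustration. vapply() is used instead of apply() so the return type is checked.

```r
frequencies <- data.frame(
  row.names   = c("a", "b", "c"),
  response    = c(10, 7, 4),
  no_response = c(12, 12, 7)
)

# Work on a plain numeric matrix of the two count columns
counts <- as.matrix(frequencies[, c("response", "no_response")])
total  <- colSums(counts)

# One Fisher test per row: totals in the first row, the subset in the second
frequencies$p.value <- vapply(
  seq_len(nrow(counts)),
  function(i) fisher.test(rbind(total, counts[i, ]))$p.value,
  numeric(1)
)

frequencies
```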