I am trying to use geom_signif() from the ggpubr package to run a chi-squared test on my data (here is a sample):
df_test <- data.frame(cat = c("A", "D", "R"), count1 = c(10, 12, 23), count2 = c(9, 3, 4))
I want to see whether some categories are statistically enriched in count2, which is a subset of count1. I have not found an example online of geom_signif being used with a test other than the Wilcoxon test, so I don't know the correct syntax for telling it what my variables are. Here is what I attempted:
library(ggplot2)
library(ggpubr)
melt_count = data.table::melt(df_test, id.vars = "cat")
ggplot(melt_count, aes(x = cat, fill = variable, y = value)) +
  geom_bar(stat = "identity", position = "dodge") +
  geom_signif(comparisons = list(c("A", "D"), c("count1", "count2")), test = "chisq.test")
Warning: Computation failed in stat_signif(): missing value where TRUE / FALSE is required
I know this is mostly doable by running the test outside of ggpubr, but I specifically want to use ggpubr, or another ggplot-affiliated package that can run the statistical test itself.
Thanks in advance,
CodePudding user response:
The problem is that the test argument requires a function that takes two vectors to compare, whereas chisq.test takes a single matrix. We could therefore create a little wrapper function:
chi.test <- function(a, b) {
  # a and b are the two vectors being compared; bind them into the matrix chisq.test() expects
  return(chisq.test(cbind(a, b)))
}
ggplot(melt_count, aes(x = cat, y = value)) +
  geom_bar(aes(fill = variable), stat = "identity", position = "dodge") +
  geom_signif(comparisons = list(c("A", "D")),
              test = "chi.test")
Created on 2022-06-14 by the reprex package (v2.0.1)
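If you want to see what the wrapper is doing for the A vs D comparison, you can call it by hand in base R. This is only a sketch: it assumes geom_signif hands the wrapper the y values of each compared x category (here the count1 and count2 values for "A" and for "D"), and counts_for is a hypothetical helper I made up for the illustration, not part of any package.

```r
# Sample data from the question
df_test <- data.frame(cat = c("A", "D", "R"),
                      count1 = c(10, 12, 23),
                      count2 = c(9, 3, 4))

# Same wrapper as in the answer: two vectors in, one chisq.test() result out
chi.test <- function(a, b) {
  chisq.test(cbind(a, b))
}

# Hypothetical helper: the count1/count2 values of one category as a plain vector
counts_for <- function(category) {
  unlist(df_test[df_test$cat == category, c("count1", "count2")])
}

# Assumption: these are the two vectors the wrapper receives for c("A", "D")
res <- chi.test(counts_for("A"), counts_for("D"))

res$observed  # the 2 x 2 table that chisq.test() actually sees
res$p.value   # the p-value of that test
```

Note that chisq.test applies Yates' continuity correction by default on a 2 x 2 table; if you don't want that, the wrapper could pass correct = FALSE.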