Chi squared test for significance

Time:11-20

I want to report on the difference between male and female fish in migration tactics. Would a chi-squared test be appropriate? This is my data:

combo1 <- structure(list(Sex = c("F", "M", "F", NA, "M", "F", NA, NA, "M", 
"F", NA, "M", "F", "F", NA, "F", "F", NA, "M", "F", NA, "M", 
"F", "F", "F", "F", "F", "M", NA, "F", "M", "M", "F", "F", "M", 
"F", NA, NA, NA, NA, NA, NA, "F", "F", NA, NA, NA, "M", "F", 
"F", NA, "F", "F", NA, NA, "M", "F", NA, "F", NA, "M", NA, "F", 
NA, NA, NA, NA, "F", "M", "M", NA, "F"), Tactic = c("Migr", "Migr", 
"Migr", "Migr", "Migr", "Migr", "Migr", "Migr", "Migr", "Migr", 
"Migr", "Migr", "Migr", "Migr", "Migr", "OcRes", "Migr", "Migr", 
"Migr", "Migr", "Migr", "Migr", "OcRes", "Migr", "Migr", "Migr", 
"Migr", "Migr", "Migr", "Migr", "Migr", "Migr", "Migr", "Migr", 
"Migr", "EstRes", "Migr", "Migr", "Migr", "Migr", "Migr", "Migr", 
"Migr", "Migr", "Migr", "Migr", "Migr", "Migr", "Migr", "Migr", 
"Migr", "Migr", "Migr", "EstRes", "Migr", "Migr", "Migr", "EstRes", 
"Migr", "OcRes", "Migr", "EstRes", "Migr", "Migr", "Migr", "Migr", 
"Migr", "Migr", "Migr", "Migr", "Migr", "Migr")), class = "data.frame", row.names = c(NA, 
-72L))

Table for Sex Proportions

library(dplyr)

Sex <- combo1 %>%
    filter(!is.na(Sex)) %>%    # drop fish of unknown sex
    droplevels() %>%
    count(Sex, Tactic) %>%
    group_by(Sex) %>%
    mutate(Proportion = n / sum(n)) %>%    # share of each tactic within a sex
    mutate(Tactic = factor(Tactic, levels = c("EstRes", "OcRes", "Migr"))) %>%
    mutate(Sex = factor(Sex, levels = c("F", "M")))
Sex
Sex Tactic  n   Proportion
F   EstRes  1   0.03448276
F   Migr    26  0.89655172
F   OcRes   2   0.06896552
M   Migr    15  1.00000000

I have a ggplot that shows these results nicely, but I am not sure how to get a p-value to support them. Would it be a chi-squared test, and if so, what would the script be? I have looked here https://data-flair.training/blogs/wp-content/uploads/sites/2/2018/01/R-Code.jpg and tried this script:

chisq.test(combo1$Sex, combo1$Tactic, correct=FALSE)

which produced this result:

Chi-squared approximation may be incorrect
Pearson's Chi-squared test

data:  combo1$Sex and combo1$Tactic
X-squared = 1.6653, df = 2, p-value = 0.4349

But I'm not sure if it is correct. Any assistance would be greatly appreciated.

CodePudding user response:

First create a contingency table of Sex by Tactic, then compute the chi-squared test on it:

tbl <- xtabs(~ Sex + Tactic, combo1)
tbl
#    Tactic
# Sex EstRes Migr OcRes
#   F      1   26     2
#   M      0   15     0

Notice that 4 of the 6 cells have very small counts. A standard chi-squared test will warn about this:

chisq.test(tbl)
# 
#   Pearson's Chi-squared test
# 
# data:  tbl
# X-squared = 1.6653, df = 2, p-value = 0.4349
# 
# Warning message:
# In chisq.test(tbl) : Chi-squared approximation may be incorrect
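
To see what triggers the warning, it can help to look at the expected counts rather than the observed ones. The sketch below rebuilds the table by hand from the counts printed above (so it runs without combo1) and pulls the expected counts from the test object. The names expected and n_small are only illustrative, and the cutoff of 5 is the usual rule of thumb, not something chisq.test exposes as a setting.

```r
# Rebuild the 2 x 3 table of observed counts shown above (Sex by Tactic).
tbl <- matrix(c(1, 26, 2,
                0, 15, 0),
              nrow = 2, byrow = TRUE,
              dimnames = list(Sex = c("F", "M"),
                              Tactic = c("EstRes", "Migr", "OcRes")))

# Expected counts under independence: row total * column total / grand total.
# suppressWarnings() only hides the approximation warning for this lookup.
expected <- suppressWarnings(chisq.test(tbl))$expected
print(round(expected, 2))

# How many cells fall below the rule-of-thumb expected count of 5?
n_small <- sum(expected < 5)
print(n_small)
```

Only the two Migr cells have expected counts above 5, which is why the chi-squared approximation is flagged for this table.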

The warning does not change the conclusion here: the p-value is far above .05, so we cannot reject the null hypothesis that tactic is independent of sex. You can also have R estimate the p-value by Monte Carlo simulation, which does not rely on the chi-squared approximation:

chisq.test(tbl, simulate.p.value=TRUE)
# 
#   Pearson's Chi-squared test with simulated p-value (based on 2000 replicates)
# 
# data:  tbl
# X-squared = 1.6653, df = NA, p-value = 0.6922

As expected, the p-value is even larger.
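
Because the simulated p-value comes from random resampling, it will differ slightly from run to run. A minimal sketch of one way to make it reproducible is shown below, assuming you are happy to fix a seed and raise the number of replicates. The seed value and B = 10000 are arbitrary choices for illustration, and the table is rebuilt by hand from the counts above so the snippet runs on its own.

```r
# Observed counts from the table above (Sex by Tactic).
tbl <- matrix(c(1, 26, 2,
                0, 15, 0),
              nrow = 2, byrow = TRUE,
              dimnames = list(Sex = c("F", "M"),
                              Tactic = c("EstRes", "Migr", "OcRes")))

set.seed(1)  # arbitrary seed, only so repeated runs give the same p-value

# B sets the number of Monte Carlo replicates (the default is 2000).
sim <- chisq.test(tbl, simulate.p.value = TRUE, B = 10000)

print(sim$statistic)  # same X-squared as the unsimulated test
print(sim$p.value)    # simulated p-value; depends on the seed and on B
```

The test statistic is unchanged by simulation. Only the way the p-value is obtained differs, so small differences from the 0.6922 shown above are expected.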
