I want to report on the difference between male and female fish in migration tactics. Would a chi-squared test be appropriate for this? This is my data:
combo1 <- structure(list(Sex = c("F", "M", "F", NA, "M", "F", NA, NA, "M",
"F", NA, "M", "F", "F", NA, "F", "F", NA, "M", "F", NA, "M",
"F", "F", "F", "F", "F", "M", NA, "F", "M", "M", "F", "F", "M",
"F", NA, NA, NA, NA, NA, NA, "F", "F", NA, NA, NA, "M", "F",
"F", NA, "F", "F", NA, NA, "M", "F", NA, "F", NA, "M", NA, "F",
NA, NA, NA, NA, "F", "M", "M", NA, "F"), Tactic = c("Migr", "Migr",
"Migr", "Migr", "Migr", "Migr", "Migr", "Migr", "Migr", "Migr",
"Migr", "Migr", "Migr", "Migr", "Migr", "OcRes", "Migr", "Migr",
"Migr", "Migr", "Migr", "Migr", "OcRes", "Migr", "Migr", "Migr",
"Migr", "Migr", "Migr", "Migr", "Migr", "Migr", "Migr", "Migr",
"Migr", "EstRes", "Migr", "Migr", "Migr", "Migr", "Migr", "Migr",
"Migr", "Migr", "Migr", "Migr", "Migr", "Migr", "Migr", "Migr",
"Migr", "Migr", "Migr", "EstRes", "Migr", "Migr", "Migr", "EstRes",
"Migr", "OcRes", "Migr", "EstRes", "Migr", "Migr", "Migr", "Migr",
"Migr", "Migr", "Migr", "Migr", "Migr", "Migr")), class = "data.frame", row.names = c(NA,
-72L))
Table for Sex Proportions
library(dplyr)

Sex <- combo1 %>%
  filter(!is.na(Sex)) %>%               # drop fish of unknown sex
  count(Sex, Tactic) %>%                # number of fish per Sex x Tactic combination
  group_by(Sex) %>%
  mutate(Proportion = n / sum(n)) %>%   # proportion of each tactic within each sex
  mutate(Tactic = factor(Tactic, levels = c("EstRes", "OcRes", "Migr"))) %>%
  mutate(Sex = factor(Sex, levels = c("F", "M")))
Sex
Sex Tactic n Proportion
F EstRes 1 0.03448276
F Migr 26 0.89655172
F OcRes 2 0.06896552
M Migr 15 1.00000000
I have a ggplot that shows these results nicely, but I am not sure how to get a p-value to support them. Would a chi-squared test be appropriate, and if so, what would the script be? I looked here https://data-flair.training/blogs/wp-content/uploads/sites/2/2018/01/R-Code.jpg and tried this script:
chisq.test(combo1$Sex, combo1$Tactic, correct=FALSE)
which produced this result:
Chi-squared approximation may be incorrect
Pearson's Chi-squared test
data: combo1$Sex and combo1$Tactic
X-squared = 1.6653, df = 2, p-value = 0.4349
But I'm not sure if it is correct. Any assistance would be greatly appreciated.
CodePudding user response:
First build a contingency table of Sex by Tactic with xtabs() (rows where Sex is NA are left out of the table), then compute the chi-squared test:
tbl <- xtabs(~Sex + Tactic, combo1)
tbl
# Tactic
# Sex EstRes Migr OcRes
# F 1 26 2
# M 0 15 0
Notice that 4 of the 6 cells have very small counts (0, 1, or 2). A standard chi-squared test will therefore warn that its approximation may be unreliable:
chisq.test(tbl)
#
# Pearson's Chi-squared test
#
# data: tbl
# X-squared = 1.6653, df = 2, p-value = 0.4349
#
# Warning message:
# In chisq.test(tbl) : Chi-squared approximation may be incorrect
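The warning is triggered by the expected counts under independence, not the observed ones: chisq.test() warns when any expected count is below 5. A minimal sketch for inspecting them is given here; the table is rebuilt by hand from the counts shown above so the snippet runs on its own, and the name `expected` is only illustrative:

```r
# Rebuild the 2 x 3 table of observed counts (same values as xtabs() gave above)
tbl <- matrix(c(1, 26, 2,
                0, 15, 0),
              nrow = 2, byrow = TRUE,
              dimnames = list(Sex = c("F", "M"),
                              Tactic = c("EstRes", "Migr", "OcRes")))

# Expected counts under independence: (row total * column total) / grand total
# suppressWarnings() hides the approximation warning already discussed above
expected <- suppressWarnings(chisq.test(tbl))$expected
round(expected, 2)

# Number of cells whose expected count falls below the usual threshold of 5
sum(expected < 5)
```

Only the two Migr cells have expected counts of 5 or more, which is why R flags the approximation.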
The warning is not really a problem here: the p-value is far above .05, so we cannot reject the null hypothesis that tactic is independent of sex. If you want a p-value that does not rely on the chi-squared approximation, you can have R estimate it by Monte Carlo simulation:
chisq.test(tbl, simulate.p.value=TRUE)
#
# Pearson's Chi-squared test with simulated p-value (based on 2000 replicates)
#
# data: tbl
# X-squared = 1.6653, df = NA, p-value = 0.6922
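The simulated p-value is even larger, so the conclusion does not change. Because it is based on random replicates, the exact value will differ slightly from run to run.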
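Two optional follow-ups, sketched under some assumptions: setting a seed makes the simulated p-value reproducible, and Fisher's exact test (fisher.test(), not part of the approach above but a common alternative for tables with small counts) avoids the large-sample approximation altogether. The seed value, `B = 10000`, and the names `sim_res` and `fisher_res` are arbitrary choices for illustration; the table is rebuilt by hand so the snippet is self-contained.

```r
# Same 2 x 3 table of observed counts as above
tbl <- matrix(c(1, 26, 2,
                0, 15, 0),
              nrow = 2, byrow = TRUE,
              dimnames = list(Sex = c("F", "M"),
                              Tactic = c("EstRes", "Migr", "OcRes")))

# Fix the random seed so the Monte Carlo p-value is the same on every run;
# B sets the number of simulated tables (the default is 2000)
set.seed(123)
sim_res <- chisq.test(tbl, simulate.p.value = TRUE, B = 10000)
sim_res$p.value

# Fisher's exact test works directly on the r x c table of counts
fisher_res <- fisher.test(tbl)
fisher_res$p.value
```

Both p-values should lead to the same conclusion as before: with these data there is no evidence that migration tactic differs between the sexes.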