Data:
df <- data.frame(
Volume = c("High", "High", "Low", "Low"),
Race = c("B", "Non-B", "B", "Non-B"),
Count = c(62, 1366, 10, 97)
) %>% as_tibble()
I want to create a contingency table (as table() would produce) from these data so that I can perform a chi-squared test, but I can't figure out how.
CodePudding user response:
library(tidyverse)
contingency.table <- df %>%
pivot_wider(names_from = Volume, values_from = Count) %>%
column_to_rownames("Race")
That creates the 2-by-2 contingency table that you want (with column- and row-name labels):
High Low
B 62 10
Non-B 1366 97
Then you can run chisq.test() on the table for a chi-squared test:
chisq.test(contingency.table)
Pearson's Chi-squared test with Yates' continuity correction
data: .
X-squared = 4.5, df = 1, p-value = 0.03
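If the individual pieces of the result are needed rather than the printed summary, the object returned by chisq.test() can be inspected directly. The following is only a minimal sketch: it rebuilds the same 2-by-2 table with matrix() so that it runs without any packages, and the names tab and res are placeholders chosen for illustration.

```r
# Rebuild the 2-by-2 table from the counts in the question (base R only).
# matrix() fills column by column: first the "High" column, then "Low".
tab <- matrix(
  c(62, 1366, 10, 97),
  nrow = 2,
  dimnames = list(Race = c("B", "Non-B"), Volume = c("High", "Low"))
)

# For a 2-by-2 table chisq.test() applies Yates' continuity correction by default.
res <- chisq.test(tab)

res$statistic   # the X-squared value
res$parameter   # degrees of freedom
res$p.value     # the p-value
res$expected    # expected counts under independence
```

Looking at res$expected is a quick way to judge whether the chi-squared approximation is reasonable for the smaller cells.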
CodePudding user response:
In the days of yore when statisticians ruled the probabilistic universe, one would not load the entire flippin tidyverse to do a crosstabs and chi-square test, but just use ordinary R functions:
df1 <- data.frame(
Volume = c("High", "High", "Low", "Low"),
Race = c("B", "Non-B", "B", "Non-B"),
Count = c(62, 1366, 10, 97) )
( xt <- xtabs( Count ~ Race + Volume, data=df1 ))
###---
Volume
Race High Low
B 62 10
Non-B 1366 97
chisq.test(xt)
#-----------------
Pearson's Chi-squared test with Yates' continuity correction
data: xt
X-squared = 4.5124, df = 1, p-value = 0.03365
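Comparing the tidyverse output to the base-R output, I judge base-R to have better labeling of the columns: the xtabs() result names both dimensions (Race and Volume), not just the individual rows and columns.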
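If the same dimension labels are wanted on the tidyverse result, one possible approach is to convert it to a matrix and name its dimnames. This is a sketch under assumptions: the data frame below is a hand-built stand-in for what pivot_wider() followed by column_to_rownames() returns, and m is just an illustrative name.

```r
# Stand-in for the data frame produced by the tidyverse answer
# (counts in columns High/Low, Race stored in the row names).
contingency.table <- data.frame(
  High = c(62, 1366),
  Low  = c(10, 97),
  row.names = c("B", "Non-B")
)

# Convert to a numeric matrix and label the two dimensions.
m <- as.matrix(contingency.table)
names(dimnames(m)) <- c("Race", "Volume")

m              # prints with the Race and Volume labels attached
chisq.test(m)  # same test, now run on the labeled matrix
```

The test itself is unaffected by the labels; they only change how the table prints.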