How do I create a table() out of a tibble() in R?

Data:


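library(tidyverse)  # provides %>% and as_tibble()

df <- data.frame(
  Volume = c("High", "High", "Low", "Low"),
  Race = c("B", "Non-B", "B", "Non-B"),
  Count = c(62, 1366, 10, 97)
) %>% as_tibble()  # as.tibble() is deprecated in favour of as_tibble()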

I want to create a table() [contingency table] using the data I have to perform a chi-squared test.

I can't figure it out though.

CodePudding user response：

library(tidyverse)

contingency.table <- df %>% 
    pivot_wider(names_from = Volume, values_from = Count) %>% 
    column_to_rownames("Race")

That creates the 2-by-2 contingency table that you want (with column- and row-name labels):

      High Low
B       62  10
Non-B 1366  97

Then you can run chisq.test() on the table for a chi-squared test:

chisq.test(contingency.table)

Pearson's Chi-squared test with Yates' continuity correction

data:  contingency.table
X-squared = 4.5, df = 1, p-value = 0.03
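
Note that the object built above is a data frame with row names rather than an object of class table; chisq.test() accepts it because it coerces data frames to matrices. If a genuine table is wanted, one possible sketch, using only base R, is to go through as.matrix() and as.table(). The name wide is a hypothetical stand-in for contingency.table, rebuilt by hand from the same counts so the snippet runs on its own.

```r
# Hypothetical stand-in for contingency.table: same counts, built by hand
# so that this sketch does not depend on the tidyverse.
wide <- data.frame(
  High = c(62, 1366),
  Low  = c(10, 97),
  row.names = c("B", "Non-B")
)

# A data frame is not a table; convert it via a numeric matrix.
tab <- as.table(as.matrix(wide))

# Optionally name the dimensions so printed output shows Race and Volume.
names(dimnames(tab)) <- c("Race", "Volume")

class(tab)                # confirm the class of the converted object
res <- chisq.test(tab)    # same test, now on a real table object
res
```

Naming the dimensions is optional, but it makes the printed table and any downstream output easier to read.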

CodePudding user response：

In the days of yore when statisticians ruled the probabilistic universe, one would not load the entire flippin tidyverse to do a crosstabs and chi-square test, but just use ordinary R functions:

df1 <- data.frame(
  Volume = c("High", "High", "Low", "Low"),
  Race = c("B", "Non-B", "B", "Non-B"),
  Count = c(62, 1366, 10, 97) ) 
( xt <- xtabs( Count ~ Race + Volume, data=df1 ))
###---
       Volume
Race    High  Low
  B       62   10
  Non-B 1366   97

 chisq.test(xt)
#-----------------
    Pearson's Chi-squared test with Yates' continuity correction

data:  xt
X-squared = 4.5124, df = 1, p-value = 0.03365

Comparing the tidyverse output to the base-R output, I judge base-R to have better labeling of the columns, since xtabs() prints the variable names (Race, Volume) along with their levels.
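
To see where the X-squared value reported above comes from, here is a rough base-R sketch that recomputes the Yates-corrected statistic by hand from the same counts. The variable names (obs, expected, stat, p_value) are illustrative only, and the formula used is the usual sum of (|O - E| - 0.5)^2 / E for a 2-by-2 table.

```r
# Same counts as in the thread, entered column by column as a 2-by-2 matrix.
obs <- matrix(
  c(62, 1366, 10, 97),
  nrow = 2,
  dimnames = list(Race = c("B", "Non-B"), Volume = c("High", "Low"))
)

# Expected counts under independence: (row total * column total) / grand total.
expected <- outer(rowSums(obs), colSums(obs)) / sum(obs)

# Yates' continuity correction subtracts 0.5 from each |O - E|
# (appropriate here because every |O - E| exceeds 0.5).
stat <- sum((abs(obs - expected) - 0.5)^2 / expected)

# Upper-tail p-value on 1 degree of freedom.
p_value <- pchisq(stat, df = 1, lower.tail = FALSE)

round(stat, 4)
round(p_value, 5)
```

The two rounded values should agree with the X-squared and p-value printed by chisq.test(xt).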