I have a dataset of race and outcome (Y or N). I want to build a 2x2 table for each race so that I can run a chi-squared test on it.
Asian 584 24
Black 1721 56
Hispanic 2400 90
White 8164 289
For each race I want a 2x2 table: for Asian, for example, the first row would hold the Asian counts and the second row the non-Asian counts, computed per column as (total Y - Asian Y) and (total N - Asian N). I could then run a chi-squared test on that table and repeat the process for every race. Is there an easier way to run a chi-squared test for each race in the table above?
CodePudding user response:
Here is one option with map2, where in each table the first row is an individual race and the second row is the sum of all other races. Each list element is then named after its race.
library(tidyverse)

pull(df, V1) %>%
  map2(
    .,
    replicate(nrow(df), df, simplify = FALSE),
    .f = function(x, y)
      y %>%
        filter(V1 != x) %>%              # keep every race except x
        summarise(across(-V1, sum)) %>%  # collapse them into one "others" row
        # put the row for race x first, the "others" row second
        bind_rows(filter(y, V1 == x) %>% dplyr::select(-V1), .)
  ) %>%
  set_names(., pull(df, V1))
Output
$Asian
V2 V3
1 584 24
2 12285 435
$Black
V2 V3
1 1721 56
2 11148 403
$Hispanic
V2 V3
1 2400 90
2 10469 369
$White
V2 V3
1 8164 289
2 4705 170
Data
df <- structure(list(V1 = c("Asian", "Black", "Hispanic", "White"),
V2 = c(584L, 1721L, 2400L, 8164L), V3 = c(24L, 56L, 90L,
289L)), class = "data.frame", row.names = c(NA, -4L))
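The answer above stops at the list of 2x2 tables. As a minimal sketch of the remaining step, assuming the same df as in the Data block, the tables can be passed to chisq.test one by one. The names tables, tests and p_values are illustrative and not part of the original answer, and the tables are rebuilt here with map only so that the snippet runs on its own.

```r
library(dplyr)
library(purrr)

# Same data as in the Data block above
df <- data.frame(
  V1 = c("Asian", "Black", "Hispanic", "White"),
  V2 = c(584L, 1721L, 2400L, 8164L),
  V3 = c(24L, 56L, 90L, 289L)
)

# One 2x2 table per race: row 1 is the race, row 2 is everyone else
tables <- map(set_names(df$V1), function(x) {
  bind_rows(
    df %>% filter(V1 == x) %>% select(-V1),
    df %>% filter(V1 != x) %>% summarise(across(-V1, sum))
  )
})

# Run the chi-squared test on each table (names are carried over by map)
tests <- map(tables, ~ chisq.test(as.matrix(.x)))

# Pull out one p-value per race as a named numeric vector
p_values <- map_dbl(tests, ~ .x$p.value)
p_values
```

Each element of tests is a regular htest object, so tests$Asian prints the full test result for that race.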
CodePudding user response:
Here is another approach. First set up the master table:
tbl <- as.matrix(df[, -1])
Sums <- matrix(colSums(tbl), nrow(tbl), 2, byrow=TRUE)
Tbl <- cbind(tbl, Sums-tbl)
row.names(Tbl) <- df[, 1]
Tbl
# Yes No Yes No
# Asian 584 24 12285 435
# Black 1721 56 11148 403
# Hispanic 2400 90 10469 369
# White 8164 289 4705 170
Now a function to create a 2x2 table from a row in Tbl:
ChiSqTable <- function(row) {
  matrix(Tbl[row, ], 2, 2, byrow = TRUE,
         dimnames = list(Race = c(df[row, 1], paste("Not", df[row, 1])),
                         Question = c("Yes", "No")))
}
Finally, create the 2x2 tables and run the chi-squared test on each one:
Tables <- lapply(seq(nrow(Tbl)), ChiSqTable)
names(Tables) <- df[, 1]
ChiSqStats <- lapply(Tables, chisq.test)
names(ChiSqStats) <- df[, 1]
Tables[[1]] # or Tables[["Asian"]]
# Question
# Race Yes No
# Asian 584 24
# Not Asian 12285 435
ChiSqStats[[1]]
#
# Pearson's Chi-squared test with Yates' continuity correction
#
# data: X[[i]]
# X-squared = 0.33997, df = 1, p-value = 0.5598
Access the remaining tables and test results by position or by race name. Every component of each chi-squared test result is saved, e.g.
ChiSqStats[[1]]$expected
# Question
# Race Yes No
# Asian 587.0612 20.93878
# Not Asian 12281.9388 438.06122
ChiSqStats[[1]]$residuals
# Question
# Race Yes No
# Asian -0.12634367 0.6689899
# Not Asian 0.02762242 -0.1462607
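To compare the races at a glance, the per-race results can be collected into one data frame. This is only a sketch in base R: stats and summary_tbl are hypothetical names, and the tests are recomputed from df (with its columns V2 and V3 from the Data block of the first answer) so that the snippet runs on its own.

```r
# Same counts as in the question
df <- data.frame(
  V1 = c("Asian", "Black", "Hispanic", "White"),
  V2 = c(584L, 1721L, 2400L, 8164L),
  V3 = c(24L, 56L, 90L, 289L)
)

counts <- as.matrix(df[, -1])
totals <- colSums(counts)

# One chi-squared test per race: the race's row against the sum of all other rows
stats <- lapply(seq_len(nrow(counts)), function(i) {
  chisq.test(rbind(counts[i, ], totals - counts[i, ]))
})
names(stats) <- df$V1

# One line per race with the test statistic and p-value
summary_tbl <- data.frame(
  Race      = names(stats),
  X_squared = sapply(stats, function(s) unname(s$statistic)),
  p_value   = sapply(stats, function(s) s$p.value),
  row.names = NULL
)
summary_tbl
```

Because chisq.test applies the Yates continuity correction to 2x2 tables by default, the Asian row should match the statistic and p-value printed earlier.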