Proportion Tables in R

I have the following data in R:

gender <- c("Male","Female")

gender <- sample(gender, 5000, replace=TRUE, prob=c(0.45, 0.55))

gender <- as.factor(gender)

disease <- c("Yes","No")

disease <- sample(disease, 5000, replace=TRUE, prob=c(0.4, 0.6))

disease <- as.factor(disease)

status <- c("Immigrant","Citizen")

status <- sample(status, 5000, replace=TRUE, prob=c(0.3, 0.7))

status <- as.factor(status)

my_data = data.frame(gender, status, disease)

I want to make a table that shows:

What percent of male immigrants have the disease?
What percent of male non-immigrants have the disease?
What percent of female immigrants have the disease?
What percent of female non-immigrants have the disease?

I tried to do this with the following code:

t1 <- xtabs(disease ~ gender + status, data=my_data)

But I get this error:
Error in Summary.factor(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,  : 
  ‘sum’ not meaningful for factors

Can someone please show me what I am doing wrong and how to fix this?

Thank you!

CodePudding user response:

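Because every column is a factor, xtabs cannot sum disease on the left-hand side of the formula, which is what triggers the error. One option is to count the combinations with count from dplyr and then convert the counts to percentages. Note that the code below normalizes within each gender and disease column, so each value is the share of citizens or immigrants among, for example, females without the disease: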

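library(dplyr)
library(tidyr)

my_data %>% 
   # count each gender/status/disease combination
   dplyr::count(across(everything())) %>% 
   # one column of counts per disease level (No, Yes)
   pivot_wider(names_from = disease, values_from = n, values_fill = 0) %>% 
   group_by(gender) %>% 
   # within each gender, convert each column to percentages
   mutate(across(No:Yes, ~ 100 * proportions(.x))) %>% 
   ungroup()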

-output

# A tibble: 4 × 4
  gender status       No   Yes
  <fct>  <fct>     <dbl> <dbl>
1 Female Citizen    69.4  72.4
2 Female Immigrant  30.6  27.6
3 Male   Citizen    70.4  68.7
4 Male   Immigrant  29.6  31.3
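
The table above gives the status split within each gender and disease column. If the aim is instead the share of each gender/status group that has the disease, as the four questions ask, one possible sketch is to group by gender and status and take the mean of a logical indicator. The seed and the names n_obs, n_group, pct_disease and pct_tbl are assumptions added for illustration and are not in the original post.

```r
library(dplyr)

set.seed(1)  # assumed seed, only so the sketch is reproducible
n_obs <- 5000
my_data <- data.frame(
  gender  = factor(sample(c("Male", "Female"), n_obs, replace = TRUE, prob = c(0.45, 0.55))),
  status  = factor(sample(c("Immigrant", "Citizen"), n_obs, replace = TRUE, prob = c(0.3, 0.7))),
  disease = factor(sample(c("Yes", "No"), n_obs, replace = TRUE, prob = c(0.4, 0.6)))
)

# percent with the disease inside each gender/status group
pct_tbl <- my_data %>%
  group_by(gender, status) %>%
  summarise(n_group     = n(),                            # group size
            pct_disease = 100 * mean(disease == "Yes"),   # share of "Yes" in the group
            .groups = "drop")

pct_tbl
```

Each of the four rows then answers one of the questions directly, for example the Male/Immigrant row holds the percent of male immigrants with the disease.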

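xtabs can also work if the factor is first converted to integer, so that the left-hand side can be summed. The integer codes (No = 1, Yes = 2) only scale each disease column by a constant, which cancels when the proportions are taken across status:

apply(xtabs(n ~ disease + gender + status, 
  transform(my_data, n = as.integer(disease))), c(1, 2), proportions) * 100

-output
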
, , gender = Female

           disease
status            No      Yes
  Citizen   69.36724 72.41993
  Immigrant 30.63276 27.58007

, , gender = Male

           disease
status            No      Yes
  Citizen   70.40185 68.68687
  Immigrant 29.59815 31.31313
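
The original error comes from putting the factor disease on the left-hand side of the xtabs formula, where xtabs expects counts it can sum. A minimal base R sketch, assuming the aim is the percent with the disease within each gender/status cell, leaves the left-hand side empty and normalizes over the first two margins. The seed and the names n_obs, counts and pct are assumptions for illustration; proportions needs R 4.0.0 or later (prop.table is the older equivalent).

```r
set.seed(1)  # assumed seed, only so the sketch is reproducible
n_obs <- 5000
my_data <- data.frame(
  gender  = factor(sample(c("Male", "Female"), n_obs, replace = TRUE, prob = c(0.45, 0.55))),
  status  = factor(sample(c("Immigrant", "Citizen"), n_obs, replace = TRUE, prob = c(0.3, 0.7))),
  disease = factor(sample(c("Yes", "No"), n_obs, replace = TRUE, prob = c(0.4, 0.6)))
)

# one-sided formula: xtabs counts the rows for each combination of factors
counts <- xtabs(~ gender + status + disease, data = my_data)

# margin = c(1, 2): No and Yes sum to 100 within each gender/status cell
pct <- 100 * proportions(counts, margin = c(1, 2))

# flatten the 3-way array into a single printable table
ftable(round(pct, 1))
```

Here the Yes column of the flattened table is the percent of each gender/status group with the disease.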