Proportion Tables in R

Time:09-04

I have the following data in R:

gender <- c("Male","Female")

gender <- sample(gender, 5000, replace=TRUE, prob=c(0.45, 0.55))

gender <- as.factor(gender)

disease <- c("Yes","No")

disease <- sample(disease, 5000, replace=TRUE, prob=c(0.4, 0.6))

disease <- as.factor(disease)

status <- c("Immigrant","Citizen")

status <- sample(status, 5000, replace=TRUE, prob=c(0.3, 0.7))

status <- as.factor(status)

my_data <- data.frame(gender, status, disease)

I want to make a table that shows:

  • What percent of male immigrants have the disease?
  • What percent of male non-immigrants have the disease?
  • What percent of female immigrants have the disease?
  • What percent of female non-immigrants have the disease?

I tried to do this with the following code:

 t1 <- xtabs(disease ~ gender + status, data=my_data)

But I get this error:
Error in Summary.factor(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,  : 
  ‘sum’ not meaningful for factors

Can someone please show me what I am doing wrong and how to fix this?

Thank you!

Answer:

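Because all of the columns are factors, count every combination with count from dplyr, reshape so that each disease level becomes a column, and then convert the counts to percentages within each gender. Each No/Yes column then shows how that gender's cases are split between citizens and immigrants.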

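library(dplyr)
library(tidyr)
my_data %>% 
   # count every gender/status/disease combination
   dplyr::count(across(everything())) %>% 
   # one column per disease level (No, Yes)
   pivot_wider(names_from = disease, values_from = n, values_fill = 0) %>% 
   group_by(gender) %>% 
   # within each gender, turn each column of counts into percentages
   mutate(100 * across(No:Yes, proportions)) %>% 
   ungroup()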

-output

# A tibble: 4 × 4
  gender status       No   Yes
  <fct>  <fct>     <dbl> <dbl>
1 Female Citizen    69.4  72.4
2 Female Immigrant  30.6  27.6
3 Male   Citizen    70.4  68.7
4 Male   Immigrant  29.6  31.3
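
The table above gives, within each gender and disease column, the share of each status. If what you want is instead the percent of each gender/status group that has the disease, as in the four questions listed, a minimal sketch could look like the following. The seed and the column name pct_with_disease are my own choices for illustration, not part of the original code, and the data is rebuilt here only so the snippet runs on its own.

```r
library(dplyr)

# Assumption: same simulated data as in the question; the seed is arbitrary
set.seed(1)
my_data <- data.frame(
  gender  = factor(sample(c("Male", "Female"), 5000, replace = TRUE, prob = c(0.45, 0.55))),
  status  = factor(sample(c("Immigrant", "Citizen"), 5000, replace = TRUE, prob = c(0.3, 0.7))),
  disease = factor(sample(c("Yes", "No"), 5000, replace = TRUE, prob = c(0.4, 0.6)))
)

# For each gender/status group, the share of rows with disease == "Yes"
pct_disease <- my_data %>%
  group_by(gender, status) %>%
  summarise(pct_with_disease = 100 * mean(disease == "Yes"), .groups = "drop")

print(pct_disease)
```

This returns one row per gender/status combination. Because the data is sampled at random, the exact percentages depend on the seed.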

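The xtabs call in the question fails because a formula with a left-hand side makes xtabs sum that variable, and a factor cannot be summed. If we convert the column to integer, it works: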

apply(xtabs(n ~ disease + gender + status, 
  transform(my_data, n = as.integer(disease))), c(1, 2), proportions) * 100

-output

, , gender = Female

           disease
status            No      Yes
  Citizen   69.36724 72.41993
  Immigrant 30.63276 27.58007

, , gender = Male

           disease
status            No      Yes
  Citizen   70.40185 68.68687
  Immigrant 29.59815 31.31313
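
As a possible alternative that avoids the integer conversion, xtabs also accepts a one-sided formula, in which case it counts rows rather than summing a left-hand side. The sketch below uses that form and takes percentages within each gender/status cell, so No and Yes add up to 100 for every group. The seed and the object names counts and pct are illustrative assumptions, and the data is rebuilt only to keep the snippet self-contained.

```r
# Assumption: same simulated data as in the question; the seed is arbitrary
set.seed(1)
my_data <- data.frame(
  gender  = factor(sample(c("Male", "Female"), 5000, replace = TRUE, prob = c(0.45, 0.55))),
  status  = factor(sample(c("Immigrant", "Citizen"), 5000, replace = TRUE, prob = c(0.3, 0.7))),
  disease = factor(sample(c("Yes", "No"), 5000, replace = TRUE, prob = c(0.4, 0.6)))
)

# One-sided formula: xtabs counts rows for each combination of the factors
counts <- xtabs(~ gender + status + disease, data = my_data)

# Percentages within each gender x status cell (margins 1 and 2)
pct <- 100 * proportions(counts, margin = c(1, 2))

# Flatten the 3-way table for easier reading
ftable(round(pct, 1))
```

The Yes column of the flattened table is then the percent of each gender/status group with the disease. Note that proportions requires a recent version of R; prop.table is the older name for the same function.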