Why are there redundant rows and columns in my contingency table?

Time:10-04

I am new to R and am currently learning about contingency tables. I want to create a contingency table from the "loans_full_schema" data (from openintro) using the "application_type" and "homeownership" variables. Below is my code.

library(oibiostat)

data("loans_full_schema")

tab <- table(loans_full_schema$application_type, loans_full_schema$homeownership)
tab

This is my outcome: [image: my outcome]

This is the outcome I want: [image: wanted outcome]

So my question is: why is there an "ANY" column and a blank row in my outcome?

CodePudding user response:

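That is because both factors contain unused levels, that is, levels that are defined in the factor but do not occur in any row. table() still reports every defined level, so these show up as a row or column of zeros.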

levels(loans_full_schema$homeownership)
#[1] ""         "ANY"      "MORTGAGE" "OWN"      "RENT"    

levels(loans_full_schema$application_type)
#[1] ""           "individual" "joint"     
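
One way to confirm which levels are unused is to tabulate a single factor and look for zero counts. The following is a minimal sketch on a made-up factor; `home` is a hypothetical stand-in that only mimics the level set shown above, not the real column.

```r
# Hypothetical factor: five declared levels, only three of them observed.
home <- factor(c("RENT", "OWN", "MORTGAGE", "RENT"),
               levels = c("", "ANY", "MORTGAGE", "OWN", "RENT"))

# One-way table: every declared level gets a count, including zeros.
counts <- table(home)

# Names of the levels whose count is zero, i.e. the unused levels.
unused <- names(counts)[as.vector(counts) == 0]
unused
```

For this toy factor the unused levels should be the empty string and "ANY", which are exactly the extra row and column labels seen in the question.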

You can drop the unused levels with droplevels(). Applied to a data frame, it removes unused levels from every factor column.

loans_full_schema <- droplevels(loans_full_schema) 

table(loans_full_schema$application_type, loans_full_schema$homeownership)
            
#             MORTGAGE  OWN RENT
#  individual     3839 1170 3496
#  joint           950  183  362
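
To see the mechanism without loading the dataset, here is a minimal sketch on a made-up data frame. The names `toy`, `app_type`, and `home` are hypothetical and the three rows are invented for illustration; only the level sets are copied from the output above.

```r
# Toy data: both factors declare levels that never occur in the rows.
toy <- data.frame(
  app_type = factor(c("individual", "joint", "individual"),
                    levels = c("", "individual", "joint")),
  home     = factor(c("RENT", "OWN", "MORTGAGE"),
                    levels = c("", "ANY", "MORTGAGE", "OWN", "RENT"))
)

# table() keeps one row/column per declared level, even with zero counts.
before <- table(toy$app_type, toy$home)
dim(before)
#[1] 3 5

# droplevels() removes the levels that have no observations.
toy_clean <- droplevels(toy)
after <- table(toy_clean$app_type, toy_clean$home)
dim(after)
#[1] 2 3
```

If you would rather not overwrite the data frame, the same effect can be obtained per column, for example table(droplevels(toy$app_type), droplevels(toy$home)).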

You can use addmargins() to add the row and column totals.

addmargins(table(loans_full_schema$application_type, loans_full_schema$homeownership))

#             MORTGAGE   OWN  RENT   Sum
#  individual     3839  1170  3496  8505
#  joint           950   183   362  1495
#  Sum            4789  1353  3858 10000
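
If only one set of totals is wanted, addmargins() also accepts a margin argument. The sketch below rebuilds the table by hand from the counts printed above so that it runs without the dataset; `tab`, `with_totals`, and `col_totals_only` are hypothetical names chosen for illustration.

```r
# Rebuild the cleaned contingency table from the counts shown above.
tab <- as.table(matrix(
  c(3839, 950, 1170, 183, 3496, 362),   # filled column by column
  nrow = 2,
  dimnames = list(c("individual", "joint"),
                  c("MORTGAGE", "OWN", "RENT"))
))

# Default: totals over both dimensions (a "Sum" row and a "Sum" column).
with_totals <- addmargins(tab)
with_totals["Sum", "Sum"]

# margin = 1 sums over the rows only, adding a "Sum" row of column totals.
col_totals_only <- addmargins(tab, margin = 1)
dim(col_totals_only)
```

Using margin = 2 instead should add only the "Sum" column of row totals.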