S3: is there a way to combine prop.table for character variables?

Time:12-07

Noob here, I'm stuck trying to use S3 to summarise proportion data for a data.frame where there are four columns of character data. My goal is to build a summary method to show the proportions for every level of every variable at one time.

I can see how to get the proportion for each column:

a50survey1 <- table(Student1995$alcohol)
a50survey2 <- table(Student1995$drugs)
a50survey3 <- table(Student1995$smoke)
a50survey4 <- table(Student1995$sport)
prop.table(a50survey1)

                  Not  Once or Twice a week          Once a month           Once a week More than once a week 
                 0.10                  0.32                  0.24                  0.28                  0.06 

But I cannot find a way to combine all of the prop.table outputs into one summary output. Unless I'm mistaken, there is no S3 method like summary.prop.table that would do this for me. The goal is to set this up for the current data frame and then drop in new data frames with the same columns and number of observations in the future.

I'm really a step-by-step guy, so if you can help me, that would be great. Thank you.

Data frame info here. There are four columns, and each column has a different number of categorical options for observations.

> dput(head(Student1995,5))
structure(list(alcohol = structure(c(3L, 2L, 2L, 2L, 3L), .Label = c("Not", 
"Once or Twice a week", "Once a month", "Once a week", "More than once a week"
), class = "factor"), drugs = structure(c(1L, 2L, 1L, 1L, 1L), .Label = c("Not", 
"Tried once", "Occasional", "Regular"), class = "factor"), smoke = structure(c(2L, 
3L, 1L, 1L, 1L), .Label = c("Not", "Occasional", "Regular"), class = "factor"), 
    sport = structure(c(2L, 1L, 1L, 2L, 2L), .Label = c("Not regular", 
    "Regular"), class = "factor")), row.names = c(NA, 5L), class = "data.frame")

The summary data, if it helps (edit):

> summary(Student1995)
                  alcohol          drugs           smoke            sport   
 Not                  : 5   Not       :36   Not       :38   Not regular:13  
 Once or Twice a week :16   Tried once: 6   Occasional: 5   Regular    :37  
 Once a month         :12   Occasional: 7   Regular   : 7                   
 Once a week          :14   Regular   : 1                                   
 More than once a week: 3 

User response:

Maybe this is what you wanted. Within each variable, the proportions sum to 1. The output below is computed on the five-row sample from the question.

lis <- sapply( Student1995, function(x) t( sapply( x, table ) ) )

sapply( lis, function(x) colSums(prop.table(x)) )
$alcohol
                  Not  Once.or.Twice.a.week          Once.a.month
                  0.0                   0.6                   0.4
          Once.a.week More.than.once.a.week
                  0.0                   0.0

$drugs
       Not Tried.once Occasional    Regular
       0.8        0.2        0.0        0.0

$smoke
       Not Occasional    Regular
       0.6        0.2        0.2

$sport
Not.regular     Regular
        0.4         0.6
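A shorter route to the same per-column proportions is a single lapply() over the data frame, which also keeps the level names as written instead of replacing spaces with dots. The sketch below rebuilds the five-row sample from the dput() output in the question and then stacks the results into one long data frame. The names props and props_long are only illustrative.

```r
# Five-row sample rebuilt from the dput() output in the question
Student1995 <- data.frame(
  alcohol = factor(c(3, 2, 2, 2, 3), levels = 1:5,
                   labels = c("Not", "Once or Twice a week", "Once a month",
                              "Once a week", "More than once a week")),
  drugs   = factor(c(1, 2, 1, 1, 1), levels = 1:4,
                   labels = c("Not", "Tried once", "Occasional", "Regular")),
  smoke   = factor(c(2, 3, 1, 1, 1), levels = 1:3,
                   labels = c("Not", "Occasional", "Regular")),
  sport   = factor(c(2, 1, 1, 2, 2), levels = 1:2,
                   labels = c("Not regular", "Regular"))
)

# One proportion table per column, returned as a named list
props <- lapply(Student1995, function(x) prop.table(table(x)))

# Stack the list into one long data frame: one row per variable/level pair
props_long <- do.call(rbind, lapply(names(props), function(v) {
  data.frame(variable   = v,
             level      = names(props[[v]]),
             proportion = as.vector(props[[v]]))
}))

print(props_long)
```

Because every column is handled by the same function, a new data frame with the same layout can be passed through unchanged.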

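And the whole summary, pooling all four columns (identical labels such as "Not" and "Regular" are counted together across variables):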

prop.table( table(as.vector( sapply( Student1995, unlist ))) )

                 Not          Not regular           Occasional
                0.35                 0.10                 0.05
        Once a month Once or Twice a week              Regular
                0.10                 0.15                 0.20
          Tried once
                0.05
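Since the question asks for an S3 summary method, here is one way it might be set up. This is a minimal sketch, not an existing R facility: the class name "survey", the constructor as_survey(), and the toy data frame are all hypothetical, and base R has no summary.prop.table method. The idea is to tag the data frame with a class, give that class a summary() method that builds one proportion table per column, and add a print() method for the result.

```r
# Hypothetical constructor: tags a data frame with the (made-up) class "survey"
as_survey <- function(df) {
  stopifnot(is.data.frame(df))
  class(df) <- c("survey", class(df))
  df
}

# summary() method: proportion of each level, computed per column
summary.survey <- function(object, ...) {
  out <- lapply(object, function(x) prop.table(table(x)))
  class(out) <- "summary.survey"
  out
}

# print() method: shows every variable's proportions in one block
print.summary.survey <- function(x, digits = 2, ...) {
  for (v in names(x)) {
    cat(v, "\n", sep = "")
    p <- round(as.vector(x[[v]]), digits)
    names(p) <- names(x[[v]])
    print(p)
    cat("\n")
  }
  invisible(x)
}

# Toy data for illustration only (not the Student1995 data)
toy <- data.frame(smoke = c("Not", "Not", "Regular", "Occasional"),
                  sport = c("Regular", "Not regular", "Regular", "Regular"))

s <- summary(as_survey(toy))
print(s)
```

With this in place, summary(as_survey(Student1995)) would give all four variables at once, and any later data frame with the same layout can be dropped in the same way.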