Noob here. I'm stuck trying to use S3 to summarise proportion data for a data.frame with four columns of categorical (factor) data. My goal is to build a summary method that shows the proportions for every level of every variable at once.
I can see how to get the proportion for each column:
a50survey1 <- table(Student1995$alcohol)
a50survey2 <- table(Student1995$drugs)
a50survey3 <- table(Student1995$smoke)
a50survey4 <- table(Student1995$sport)
prop.table(a50survey1)
                  Not  Once or Twice a week          Once a month           Once a week More than once a week
                 0.10                  0.32                  0.24                  0.28                  0.06
But I cannot find a way to combine all of the prop.table outputs into one summary output.
Unless I'm really wrong, I cannot find an S3 method like summary.prop.table that would work for me. The goal is to set this up for the current data frame and then drop in new data frames of the same size and number of observations in the future.
I'm really a step by step guy and if you can help me, that would be great - thank you
Data frame info here. There are four columns, and each column has a different number of categorical options for its observations.
> dput(head(Student1995,5))
structure(list(alcohol = structure(c(3L, 2L, 2L, 2L, 3L), .Label = c("Not",
"Once or Twice a week", "Once a month", "Once a week", "More than once a week"
), class = "factor"), drugs = structure(c(1L, 2L, 1L, 1L, 1L), .Label = c("Not",
"Tried once", "Occasional", "Regular"), class = "factor"), smoke = structure(c(2L,
3L, 1L, 1L, 1L), .Label = c("Not", "Occasional", "Regular"), class = "factor"),
sport = structure(c(2L, 1L, 1L, 2L, 2L), .Label = c("Not regular",
"Regular"), class = "factor")), row.names = c(NA, 5L), class = "data.frame")
Edit: the summary output, if it helps:
> summary(Student1995)
                  alcohol          drugs          smoke            sport
 Not                  : 5   Not       :36   Not       :38   Not regular:13
 Once or Twice a week :16   Tried once: 6   Occasional: 5   Regular    :37
 Once a month         :12   Occasional: 7   Regular   : 7
 Once a week          :14   Regular   : 1
 More than once a week: 3
CodePudding user response:
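Maybe this is what you wanted. The proportions within each variable sum to 1. The output below is for the five-row sample from your dput().
# For each column, build an observations-by-levels indicator matrix
lis <- sapply(Student1995, function(x) t(sapply(x, table)))
# Column sums of the cell proportions give the share of each level
sapply(lis, function(x) colSums(prop.table(x)))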
$alcohol
                  Not  Once.or.Twice.a.week          Once.a.month
                  0.0                   0.6                   0.4
          Once.a.week More.than.once.a.week
                  0.0                   0.0
$drugs
       Not Tried.once Occasional    Regular
       0.8        0.2        0.0        0.0
$smoke
       Not Occasional    Regular
       0.6        0.2        0.2
$sport
Not.regular     Regular
        0.4         0.6
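A shorter route to the same per-column proportions, as a minimal sketch: table() on a factor already counts every level, so lapply() over the columns is enough. The Student1995 rebuilt here is only the five-row sample from the question's dput(), included so the code runs on its own; the name props is just for illustration.

```r
# Rebuild the five-row sample from the question's dput() so this runs standalone
Student1995 <- data.frame(
  alcohol = factor(c("Once a month", "Once or Twice a week", "Once or Twice a week",
                     "Once or Twice a week", "Once a month"),
                   levels = c("Not", "Once or Twice a week", "Once a month",
                              "Once a week", "More than once a week")),
  drugs   = factor(c("Not", "Tried once", "Not", "Not", "Not"),
                   levels = c("Not", "Tried once", "Occasional", "Regular")),
  smoke   = factor(c("Occasional", "Regular", "Not", "Not", "Not"),
                   levels = c("Not", "Occasional", "Regular")),
  sport   = factor(c("Regular", "Not regular", "Not regular", "Regular", "Regular"),
                   levels = c("Not regular", "Regular"))
)

# One proportion table per column; unused levels are kept and show up as 0
props <- lapply(Student1995, function(x) prop.table(table(x)))
props$alcohol
```

Because the result is a named list, props$drugs, props$smoke and props$sport give the other three tables, and the level names keep their spaces.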
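and the whole summary, pooling all four columns (labels shared between columns, such as "Not" and "Regular", are counted together):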
prop.table( table(as.vector( sapply( Student1995, unlist ))) )
                 Not          Not regular           Occasional
                0.35                 0.10                 0.05
        Once a month Once or Twice a week              Regular
                0.10                 0.15                 0.20
          Tried once
                0.05
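Since the question asks for an S3 summary method, here is one possible sketch rather than a definitive design. Base R has no summary.prop.table, so this assumes you tag the data frame with your own class and write the method yourself. The class name "survey", the helper as_survey() and the toy data frame are all made up for illustration; toy reuses the smoke and sport columns from the five-row sample.

```r
# Hypothetical helper: tag a data frame of factors with an extra S3 class
as_survey <- function(df) {
  stopifnot(is.data.frame(df))
  class(df) <- c("survey", class(df))
  df
}

# summary() dispatches here for any object whose class includes "survey"
summary.survey <- function(object, ...) {
  # One proportion table per column
  props <- lapply(object, function(x) prop.table(table(x)))
  # Stack the tables into one long data frame: variable, level, proportion
  out <- do.call(rbind, lapply(names(props), function(v) {
    data.frame(variable = v,
               level = names(props[[v]]),
               proportion = as.vector(props[[v]]),
               stringsAsFactors = FALSE)
  }))
  class(out) <- c("summary.survey", "data.frame")
  out
}

# Toy data: the smoke and sport columns of the five-row sample
toy <- data.frame(
  smoke = factor(c("Occasional", "Regular", "Not", "Not", "Not"),
                 levels = c("Not", "Occasional", "Regular")),
  sport = factor(c("Regular", "Not regular", "Not regular", "Regular", "Regular"),
                 levels = c("Not regular", "Regular"))
)

res <- summary(as_survey(toy))
res
```

The result still inherits from data.frame, so it prints as an ordinary data frame with no separate print method. A future data frame with the same kind of columns can be dropped in with summary(as_survey(new_df)), where new_df stands for whatever data you load later.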