How can I aggregate tables in R without merging them?

I have:

a number m of categorical features (x1, x2, ... xm)
1 categorical feature (y)
all in a dataframe (df).

I would like a function that gives a single table with all the crossings between each xi and y. For example:

table1 = table(df$x1, df$y) ... tablem = table(df$xm, df$y)
then aggregate the tables with rbind.

I'm almost there but it doesn't work.

CodePudding user response：

How about this:

data(diamonds, package="ggplot2")
tabs <- lapply(diamonds[,c("color", "clarity")], \(x){
  table(x, diamonds$cut)
})

do.call(rbind,tabs)
#>      Fair Good Very Good Premium Ideal
#> D     163  662      1513    1603  2834
#> E     224  933      2400    2337  3903
#> F     312  909      2164    2331  3826
#> G     314  871      2299    2924  4884
#> H     303  702      1824    2360  3115
#> I     175  522      1204    1428  2093
#> J     119  307       678     808   896
#> I1    210   96        84     205   146
#> SI2   466 1081      2100    2949  2598
#> SI1   408 1560      3240    3575  4282
#> VS2   261  978      2591    3357  5071
#> VS1   170  648      1775    1989  3589
#> VVS2   69  286      1235     870  2606
#> VVS1   17  186       789     616  2047
#> IF      9   71       268     230  1212

Created on 2022-05-30 by the reprex package (v2.0.1)
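
One thing the stacked result above loses is which variable each row came from. A minimal sketch of one way to keep that, wrapped in a function as the question asks: the helper name cross_all and the "variable=level" row label are my own choices, not anything from base R, and it assumes every x is crossed with the same y so the columns line up for rbind. It uses mtcars so that it runs without extra packages.

```r
# Hypothetical helper (name is illustrative): cross each column listed in `xs`
# with column `y` of `df`, then stack the resulting tables row-wise.
cross_all <- function(df, xs, y) {
  tabs <- lapply(xs, function(v) {
    tab <- table(df[[v]], df[[y]])
    # Label each row with its source variable, e.g. "vs=0"
    rownames(tab) <- paste0(v, "=", rownames(tab))
    tab
  })
  # All tables share the levels of y as columns, so rbind lines them up
  do.call(rbind, tabs)
}

res <- cross_all(mtcars, c("vs", "am", "gear"), "carb")
res
```

The result is a plain matrix with one row per (variable, level) pair and one column per level of y.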

CodePudding user response：

An example with mtcars, crossing c("vs","am","gear") (your x's) with "carb" (your y):

do.call(
  rbind,
  sapply(
    c("vs","am","gear"),
    function(x){
      as.data.frame(table(mtcars[,x],mtcars$carb))
    },
    simplify=FALSE  # keep a named list so rbind builds row names like vs.1
  )
)

        Var1 Var2 Freq
vs.1       0    1    0
vs.2       1    1    7
vs.3       0    2    5
vs.4       1    2    5
vs.5       0    3    3
vs.6       1    3    0
vs.7       0    4    8
vs.8       1    4    2
...

Var1 is the value of the variable named in the row names (here vs), Var2 is the value of y (here carb), and Freq is the count for that combination.
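
If you would rather not rely on row names to tell the variables apart, a variation on the same idea is to store the variable name in its own column. This is only a sketch: the column names variable, level and carb are my own labels, not something as.data.frame produces.

```r
# Long format: one row per (variable, level, carb) combination
long <- do.call(rbind, lapply(c("vs", "am", "gear"), function(v) {
  d <- as.data.frame(table(mtcars[[v]], mtcars$carb), stringsAsFactors = FALSE)
  names(d) <- c("level", "carb", "Freq")  # rename the default Var1/Var2
  cbind(variable = v, d)                  # record which x the rows came from
}))

head(long)
```

From there you can filter or reshape by variable without parsing row names.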