Summarize multiple variables by group with xtabs()

Time:11-30

xtabs() can build a summary table. With cbind() on the left-hand side of the formula, it sums several variables at once, grouped by the remaining variable:

df <- data.frame(publication_date = c("2015 Jul", "2015 Jul", "2015 Aug", "2015 Aug"),
                 Asym = c(3, 5, 1, 2),
                 Auth = c(5, 7, 2, 3),
                 Cert = c(1, 2, 3, 4))

xtabs(cbind(Auth, Asym, Cert)~., data=df)

#publication_date Auth Asym Cert
#        2015 Aug    5    3    7
#        2015 Jul   12    8    3

Is there a way to programmatically cbind all but one variable, that is, without writing out all the variable names (for example, if df has many more than 3 columns)?

I tried

xtabs(cbind(df[2:4])~., data=df)
xtabs(cbind(names(df[2:4]))~., data=df)
#Error in ... variable lengths differ

CodePudding user response:

Build the formula as a string with paste/sprintf and convert it to a formula object with as.formula:

xtabs(as.formula(sprintf("cbind(%s)~.", toString(names(df)[-1]))), data = df)

Output:

publication_date Asym Auth Cert
        2015 Aug    3    5    7
        2015 Jul    8   12    3

Or, as @G. Grothendieck mentioned, passing just the character string as the formula is enough, so the as.formula() call can be dropped.
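A minimal sketch of that string-only variant, reusing the df from the question. The names f_string and tab are introduced here for illustration only. It assumes the column names are syntactically valid R names, since they are pasted into the formula text unquoted.

```r
# Data from the question
df <- data.frame(publication_date = c("2015 Jul", "2015 Jul", "2015 Aug", "2015 Aug"),
                 Asym = c(3, 5, 1, 2),
                 Auth = c(5, 7, 2, 3),
                 Cert = c(1, 2, 3, 4))

# Build the formula text from every column name except the first (the grouping column)
f_string <- sprintf("cbind(%s)~.", toString(names(df)[-1]))
f_string
# "cbind(Asym, Auth, Cert)~."

# Pass the string straight to xtabs(), without calling as.formula() first
tab <- xtabs(f_string, data = df)
tab
```

The columns come out in the order they appear in df (Asym, Auth, Cert), which is why the table differs from the hand-written cbind(Auth, Asym, Cert) call only in column order.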

CodePudding user response:

We could sum each column by group with lapply/tapply, bind the results into a matrix, and then set the names of the dimnames.

tab <- do.call("cbind", lapply(df[-1], tapply, df[[1]], sum))
names(dimnames(tab)) <- c(names(df)[1], "")
class(tab) <- c("xtabs", "array") # optional

tab
## publication_date Asym Auth Cert
##         2015 Aug    3    5    7
##         2015 Jul    8   12    3
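As a quick sanity check, the sketch below compares the sums from the lapply/tapply approach with those from an explicit xtabs() call on the question's data. The names tab_tapply, tab_xtabs and same_values are made up for this illustration. Only the cell values are compared, not the classes or attributes of the two objects.

```r
# Data from the question
df <- data.frame(publication_date = c("2015 Jul", "2015 Jul", "2015 Aug", "2015 Aug"),
                 Asym = c(3, 5, 1, 2),
                 Auth = c(5, 7, 2, 3),
                 Cert = c(1, 2, 3, 4))

# Group sums via lapply/tapply, one column of the result per value column
tab_tapply <- do.call("cbind", lapply(df[-1], tapply, df[[1]], sum))

# Group sums via xtabs, with the value columns written in the same order
tab_xtabs <- xtabs(cbind(Asym, Auth, Cert) ~ ., data = df)

# Compare the cell values only (both results are 2 x 3, stored column by column)
same_values <- all(as.vector(tab_tapply) == as.vector(tab_xtabs))
same_values
```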