Is there a way to "summarize_by_group" without having to group_by the whole data each time


I have a data frame with numerous variables I can group by.

Currently I write a new chunk for each grouping variable:

df %>% group_by(variable) %>% summarize()

Yet when I make a boxplot, I do not have to do this. I can simply add the groups in the function:

boxplot(numericvariable ~ variable_I_want_to_group_by, data = df)

This allows me in Rmarkdown to write all the different group_by's in the same chunk and view all the plots created next to each other.

I would like to find a summarize function that takes the grouping as one of its arguments in the same way (or another function from a different package that does the same).

Answer:

You can use base R's aggregate(), which has a formula interface similar to boxplot():

aggregate(disp ~ cyl, mtcars, \(x) c(mean=mean(x), n=length(x)))
#   cyl disp.mean   disp.n
# 1   4  105.1364  11.0000
# 2   6  183.3143   7.0000
# 3   8  353.1000  14.0000

which gives the same summary values as the dplyr version:

mtcars %>%
  group_by(cyl) %>%
  summarise(mean = mean(disp), n = n())
# # A tibble: 3 × 3
#     cyl  mean     n
#   <dbl> <dbl> <int>
# 1     4  105.    11
# 2     6  183.     7
# 3     8  353.    14
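
Since the goal was to keep several groupings in one chunk, the aggregate() call above can be repeated over a vector of column names. This is only a minimal sketch: the choice of cyl, gear and am as grouping columns and the names group_vars and summaries are assumptions made for illustration, not part of the original question.

```r
# Grouping columns to summarise by (picked here only for illustration)
group_vars <- c("cyl", "gear", "am")

# Build one formula per grouping column (e.g. disp ~ cyl) and run
# aggregate() for each; the results are collected in a named list
summaries <- lapply(group_vars, function(g) {
  aggregate(reformulate(g, response = "disp"), data = mtcars, FUN = mean)
})
names(summaries) <- group_vars

# Each element is a data frame with one row per group
summaries$cyl
summaries$gear
```

If dplyr 1.1.0 or later is installed, summarise() also accepts a .by argument, so the grouping can be given inside the call itself, e.g. summarise(mtcars, mean = mean(disp), n = n(), .by = cyl).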