group by multiple variables without intersection

Time:01-18

I want to group_by multiple columns without getting their intersection, i.e. I want a separate summary for each column.

I am looking for the output below without having to replicate the code for both variables.

library(dplyr)

> mtcars %>% 
    group_by(cyl) %>%
    summarise(mean(disp))
# A tibble: 3 × 2
    cyl `mean(disp)`
  <dbl>        <dbl>
1     4         105.
2     6         183.
3     8         353.

> mtcars %>% 
    group_by(am) %>%
    summarise(mean(disp))
# A tibble: 2 × 2
     am `mean(disp)`
  <dbl>        <dbl>
1     0         290.
2     1         144.

I am not looking for the code below since this gives the intersection between the variables:

> mtcars %>% 
    group_by(cyl, am) %>%
    summarise(mean(disp))
# A tibble: 6 × 3
# Groups:   cyl [3]
    cyl    am `mean(disp)`
  <dbl> <dbl>        <dbl>
1     4     0        136. 
2     4     1         93.6
3     6     0        205. 
4     6     1        155  
5     8     0        358. 
6     8     1        326 

Thanks a lot!

CodePudding user response:

An alternative would be a custom function:

my_func <- function(df, group){
  df %>% 
    group_by({{group}}) %>% 
    summarise(mean_disp = mean(disp))
}

> my_func(mtcars, cyl)
# A tibble: 3 × 2
    cyl mean_disp
  <dbl>     <dbl>
1     4      105.
2     6      183.
3     8      353.
> my_func(mtcars, am)
# A tibble: 2 × 2
     am mean_disp
  <dbl>     <dbl>
1     0      290.
2     1      144.

CodePudding user response:

Something like this?

library(tidyverse)

c("cyl", "am") %>% 
  map(~ mtcars %>% 
        group_by(!!sym(.x)) %>% 
        summarise(result = mean(disp)))

[[1]]
# A tibble: 3 x 2
    cyl result
  <dbl>  <dbl>
1     4   105.
2     6   183.
3     8   353.

[[2]]
# A tibble: 2 x 2
     am result
  <dbl>  <dbl>
1     0   290.
2     1   144.
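
If a single table is more convenient than a list, one possible sketch is to reshape the grouping columns to long format first and then group by the column name together with its value. This is not taken from the answers above: the column names `variable` and `level` and the object name `combined` are chosen here only for illustration, and the approach assumes the grouping columns share a type (both `cyl` and `am` are numeric in `mtcars`).

```r
library(dplyr)
library(tidyr)

# Stack cyl and am into two columns: "variable" holds the original
# column name, "level" holds its value. Each row of mtcars appears
# once per grouping column, so the groups never intersect.
combined <- mtcars %>%
  pivot_longer(c(cyl, am), names_to = "variable", values_to = "level") %>%
  group_by(variable, level) %>%
  summarise(mean_disp = mean(disp), .groups = "drop")

combined
```

The result should have one row per level of each variable (two for `am`, three for `cyl`), with the same means as the separate `group_by()` calls in the question.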