Summarise with rowwise produces empty values

Time:10-21

I have a data frame with multiple columns and want to summarise it row-wise by taking the mean of the columns whose names start with a given prefix. The result should contain a single column per prefix instead of the original individual columns.

For example:

iris %>% aggregate(. ~ Species, data=., sum) %>% group_by(Species) %>% mutate(summarise(across(starts_with(c('Sepal','Petal')), mean), .groups = "rowwise")) 

produces:

# A tibble: 3 × 6
# Groups:   Species [3]
  Species    Sepal.Length Sepal.Width Petal.Length Petal.Width               
  <fct>             <dbl>       <dbl>        <dbl>       <dbl> <rowwise_df[,0]>
1 setosa             250.        171.         73.1        12.3                 
2 versicolor         297.        138.        213          66.3                 
3 virginica          329.        149.        278.        101.                  

However, I was expecting a dataframe like the following:

 Species    Sepal   Petal                   
1 setosa     210.5  41.5 
..
..   

CodePudding user response:

The code mixes tidyverse with base R. This can be done directly in tidyverse: after grouping by 'Species', get the column-wise sum with across, then take the rowMeans of the columns that match each prefix.

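library(dplyr)
iris %>% 
  group_by(Species) %>%
  # column-wise sum of every non-grouping column, one row per Species
  summarise(across(everything(), sum), .groups = 'drop') %>% 
  # vectorised row means over the columns sharing each prefix
  transmute(Species, Sepal = rowMeans(across(starts_with("Sepal"))),
      Petal = rowMeans(across(starts_with("Petal"))))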

-output

# A tibble: 3 × 3
  Species    Sepal Petal
  <fct>      <dbl> <dbl>
1 setosa      211.  42.7
2 versicolor  218. 140. 
3 virginica   239. 189. 
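
If there are many prefixes, repeating one rowMeans line per prefix gets tedious. The following is a minimal sketch of the same idea driven by a vector of prefixes. The names `prefixes`, `sums`, `means` and `result` are made up for illustration, and the columns are matched with base R startsWith instead of starts_with.

```r
library(dplyr)

# Hypothetical vector of column-name prefixes to average over
prefixes <- c("Sepal", "Petal")

# Column-wise sums per Species, as in the answer above
sums <- iris %>%
  group_by(Species) %>%
  summarise(across(everything(), sum), .groups = "drop")

# One rowMeans() call per prefix; sapply() names the resulting
# matrix columns after the prefixes
means <- sapply(prefixes, function(p) {
  rowMeans(sums[startsWith(names(sums), p)])
})

# Put the Species column back in front of the per-prefix means
result <- bind_cols(sums["Species"], as_tibble(means))
result
```

This should give the same three columns (Species, Sepal, Petal) as the transmute version.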

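If we want to use rowwise instead (note that rowwise is slower than the vectorized rowMeans), set .groups = 'rowwise' in summarise and take the mean with c_across. 'Species' is kept in the output because it stays as the grouping column: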

iris %>% 
  group_by(Species) %>%
  summarise(across(everything(), sum), .groups = 'rowwise') %>% 
  transmute(Sepal = mean(c_across(starts_with("Sepal"))), 
    Petal = mean(c_across(starts_with("Petal")))) %>%
  ungroup()

-output

# A tibble: 3 × 3
  Species    Sepal Petal
  <fct>      <dbl> <dbl>
1 setosa      211.  42.7
2 versicolor  218. 140. 
3 virginica   239. 189. 
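
As a quick sanity check, the two approaches can be compared directly. This is only a sketch: the names `sums`, `vectorised` and `by_row` are made up here, and rowwise(Species) is used in place of .groups = 'rowwise' so that both versions start from the same summed table.

```r
library(dplyr)

# Column-wise sums per Species, shared by both versions
sums <- iris %>%
  group_by(Species) %>%
  summarise(across(everything(), sum), .groups = "drop")

# Vectorised version: one rowMeans() call per prefix
vectorised <- sums %>%
  transmute(Species,
            Sepal = rowMeans(across(starts_with("Sepal"))),
            Petal = rowMeans(across(starts_with("Petal"))))

# Row-by-row version: mean() is evaluated once for each row;
# Species is preserved as the rowwise grouping column
by_row <- sums %>%
  rowwise(Species) %>%
  transmute(Sepal = mean(c_across(starts_with("Sepal"))),
            Petal = mean(c_across(starts_with("Petal")))) %>%
  ungroup()

# Compare the numeric columns up to floating-point tolerance
all.equal(vectorised$Sepal, by_row$Sepal)
all.equal(vectorised$Petal, by_row$Petal)
```

Both comparisons should agree, so the choice between the two comes down to speed and readability.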