I'm looking to report the min, max, and mean of certain columns (price, age, and dist) from the houses data set using pipes in a concise tibble. For now, I have the following code, which produces a rather inelegant solution with a 1x9 tibble:
houses %>%
select(price, age, dist) %>%
summarize_each(list(min = min, max = max, mean = mean))
I was hoping to create a more organized solution using pipes, with the selected columns as rows and the summary stats (min, max, mean) as columns, resulting in a 3x3 tibble. Any ideas?
CodePudding user response:
You can first reshape the data to long format and then calculate the summary statistics for each column. Here is an example with the mtcars dataset.
library(dplyr)
library(tidyr)
mtcars %>%
select(mpg, disp, cyl) %>%
pivot_longer(cols = everything()) %>%
group_by(name) %>%
summarise(min = min(value, na.rm = TRUE),
max = max(value, na.rm = TRUE),
mean = mean(value, na.rm = TRUE))
#   name    min   max   mean
#   <chr> <dbl> <dbl>  <dbl>
# 1 cyl     4     8     6.19
# 2 disp   71.1 472   231.
# 3 mpg    10.4  33.9  20.1
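A variation on the same idea, offered only as a sketch: summarise first with across(), then reshape the 1x9 result with pivot_longer() and the special ".value" name. It assumes dplyr 1.0 or later and tidyr 1.0 or later. The object name stats and the "." separator are choices made for this example, and mtcars again stands in for the houses data.

```r
library(dplyr)
library(tidyr)

# Summarise each column, naming the results like "mpg.min", "mpg.max", ...
stats <- mtcars %>%
  summarise(across(c(mpg, disp, cyl),
                   list(min = min, max = max, mean = mean),
                   .names = "{.col}.{.fn}")) %>%
  # Split each name at the dot: the first part becomes the row label,
  # the second part (".value") becomes the output column
  pivot_longer(everything(),
               names_to = c("name", ".value"),
               names_sep = "\\.")

stats
```

Unlike the group_by() version, this keeps the rows in the order the columns were selected (mpg, disp, cyl) instead of sorting them alphabetically.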
CodePudding user response:
A possible solution to output a dataframe:
library(dplyr)
houses %>%
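summarise(across(c(price, age, dist), c(max, min, mean))) %>%
# flatten the 1x9 result to a numeric vector, then fill a 3x3 matrix row by row
unlist() %>%
matrix(ncol = 3, byrow = TRUE) %>%
as.data.frame() %>%
rename(Max = V1, Min = V2, Mean = V3)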
A possible solution to output a tibble:
library(dplyr)
houses %>%
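summarise(across(c(price, age, dist), c(max, min, mean))) %>%
# flatten to a numeric vector and fill a 3x3 matrix row by row;
# rows end up in the order price, age, dist
unlist() %>%
matrix(ncol = 3, byrow = TRUE, dimnames = list(NULL, c("Max", "Min", "Mean"))) %>%
as_tibble()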
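Reshaping through matrix() drops the variable names, so the rows can only be identified by their order. A minimal base R sketch that keeps them as row names is shown here. It uses mtcars as a stand-in because the houses data is not available, and the names vars and stats_df are made up for this example.

```r
# mtcars stands in for the houses data, which is not available here
vars <- c("mpg", "disp", "cyl")

# sapply() returns a 3 x 3 matrix with one column per variable;
# t() flips it so that each variable becomes a row
stats_df <- as.data.frame(
  t(sapply(mtcars[vars], function(x) c(Max = max(x), Min = min(x), Mean = mean(x))))
)

stats_df
```

If a tibble is preferred, tibble::rownames_to_column() can move the row names into a regular column before converting.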