I am teaching myself to use the tidyverse more, in the hope of writing cleaner code in the future.
I have data that looks like this:
data <- as_tibble(data.frame(x = c(1,2,3,3,4),
                             y = c(3,4,4,2,5),
                             z = c(1,1,5,5,3)))
And I would like to get the mean, sd, and confidence intervals for all 3 columns.
The code I am hoping to use is this:
data %>%
  summarize_at(vars(x:z), list(mean=mean, sd=sd, cilow = ci[2], cihigh = ci[3]))
where the ci() function is from the gmodels package. When I pass a single variable to ci(), I can pick which element of the output to view, but when it is part of a list of functions, I get the error:
Error in ci[2] : object of type 'closure' is not subsettable
Any advice/suggestions are appreciated! I am trying not to manually calculate all the CIs (my actual data has many more variables to calculate)
Answer:
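We can use a lambda function. Writing ci[2] tries to subset the function object itself (a closure) before it is ever called, whereas ~ ci(.x)[2] calls ci() on each column and then takes the second element of the result. In addition, the _at/_all variants are superseded in favor of across():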
library(dplyr)
library(gmodels)
data %>%
  summarise(across(x:z, list(mean = ~ mean(.x, na.rm = TRUE),
                             sd = ~ sd(.x, na.rm = TRUE),
                             cilow = ~ ci(.x)[2],
                             cihigh = ~ ci(.x)[3])))
Output:
# A tibble: 1 × 12
  x_mean  x_sd x_cilow x_cihigh y_mean  y_sd y_cilow y_cihigh z_mean  z_sd z_cilow z_cihigh
   <dbl> <dbl>   <dbl>    <dbl>  <dbl> <dbl>   <dbl>    <dbl>  <dbl> <dbl>   <dbl>    <dbl>
1    2.6  1.14    1.18     4.02    3.6  1.14    2.18     5.02      3     2   0.517     5.48
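To see the difference between subsetting a function and subsetting its result outside of dplyr, here is a minimal base-R sketch. The function f is a hypothetical stand-in for anything that returns a named vector, as ci() does; it is not part of gmodels.

```r
# Hypothetical stand-in for a function that returns a named vector
f <- function(v) c(est = mean(v), low = min(v), high = max(v))

# Subsetting the function object itself fails, because f is a closure,
# not a vector; tryCatch() captures the error message instead of stopping
res <- tryCatch(f[2], error = function(e) conditionMessage(e))
print(res)

# Calling the function first and then subsetting its result works
second <- f(c(1, 2, 3))[2]
print(second)
```

The lambda form ~ ci(.x)[2] does the same thing as the last call: it evaluates the function on the column and only then picks an element.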
Or with summarise_at, keeping the original call and wrapping only the ci() calls in lambdas:
data %>%
  summarize_at(vars(x:z), list(mean=mean, sd=sd, cilow = ~ ci(.)[2], cihigh = ~ ci(.)[3]))
# A tibble: 1 × 12
  x_mean y_mean z_mean  x_sd  y_sd  z_sd x_cilow y_cilow z_cilow x_cihigh y_cihigh z_cihigh
   <dbl>  <dbl>  <dbl> <dbl> <dbl> <dbl>   <dbl>   <dbl>   <dbl>    <dbl>    <dbl>    <dbl>
1    2.6    3.6      3  1.14  1.14     2    1.18    2.18   0.517     4.02     5.02     5.48
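If you would rather not depend on gmodels, the same pattern works with a small helper of your own. The sketch below assumes a standard t-based interval for the mean at the 95% level, which is consistent with the numbers shown above; ci_t and its level argument are hypothetical names, not part of any package. Picking elements by name instead of by position also makes the lambdas a little easier to read.

```r
library(dplyr)

# Hypothetical helper (not from gmodels): t-based confidence interval for a mean
ci_t <- function(v, level = 0.95) {
  v <- v[!is.na(v)]                      # drop missing values
  n <- length(v)
  se <- sd(v) / sqrt(n)                  # standard error of the mean
  margin <- qt(1 - (1 - level) / 2, df = n - 1) * se
  c(low = mean(v) - margin, high = mean(v) + margin)
}

data <- tibble(x = c(1, 2, 3, 3, 4),
               y = c(3, 4, 4, 2, 5),
               z = c(1, 1, 5, 5, 3))

# Same across() pattern as above, selecting the bounds by name
result <- data %>%
  summarise(across(x:z, list(mean = ~ mean(.x, na.rm = TRUE),
                             sd = ~ sd(.x, na.rm = TRUE),
                             cilow = ~ ci_t(.x)[["low"]],
                             cihigh = ~ ci_t(.x)[["high"]])))
result
```

Each lambda calls ci_t() separately, so the interval is computed twice per column; for a handful of columns that cost is negligible.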