I'm trying to create a series of frequency tables for several categorical variables using the freq_table() function. I've seen plenty of posts about doing this with similar table functions, but I can't get those approaches to work with freq_table(), which I'm specifically interested in because it automatically generates confidence intervals.
This is the long version of what I'm trying to accomplish:
library(freqtables)
data(mtcars)
freq_table_am <- freq_table(mtcars, am, percent_ci = 95, ci_type = "logit", drop = FALSE)
freq_table_gear <- freq_table(mtcars, gear, percent_ci = 95, ci_type = "logit", drop = FALSE)
freq_tables <- rbind(freq_table_am, freq_table_gear)
freq_tables
I've tried using both lapply and for loops to accomplish this, with no luck so far. As an example of my attempt with the for loop:
vars <- c('am', 'gear')
# Attempt 1 with a for loop - produces an error message about "[" being an unexpected symbol
for (i in vars) {
  freq_table(community_survey, [i], percent_ci = 95, ci_type = "logit", drop = FALSE)
}
# Attempt 2 with a for loop - produces an error message about column "i" not being found
for (i in vars) {
  freq_table(community_survey, i, percent_ci = 95, ci_type = "logit", drop = FALSE)
}
Is there a way to do what I'm hoping to do, using either lapply, for loops, or another method? Or is there an alternative function I can use that will work better than freq_table() but still produce the confidence intervals I'm looking for?
Thank you!
User response:
It does not work because the function freqtables::freq_table() calls the dplyr function count() under the hood, and this uses tidy evaluation to select grouping variables (in this case "gear" and "am").
The solution is mentioned on this page:
"If you have the column name as a character vector, use the .data pronoun, e.g. summarise(df, mean = mean(.data[[var]]))."
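To see the .data pronoun at work without freq_table(), here is a minimal sketch that calls dplyr::count() directly on mtcars. The names vars and counts are only illustrative, and it assumes a reasonably recent dplyr is installed:

```r
library(dplyr)

vars <- c("am", "gear")

# count() data-masks its arguments, so a bare `v` is treated as an expression,
# not as the name of the column it holds. .data[[v]] looks up the column whose
# name is stored in the string v.
counts <- lapply(vars, function(v) count(mtcars, .data[[v]]))
names(counts) <- vars

counts$am    # one row per level of am, with the count in column n
counts$gear  # one row per level of gear
```

The same indexing works for any function that forwards its arguments to a data-masking dplyr verb, which is why it carries over to freq_table().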
See below for a solution with base R lapply:
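library(freqtables)
# Column names to tabulate, as in the question
vars <- c("am", "gear")
# Build one frequency table per variable; .data[[x]] selects the column named by x
table_list <- lapply(vars, function(x) {
  freq_table(mtcars, .data[[x]], percent_ci = 95, ci_type = "logit", drop = FALSE)
})
# Stack the per-variable tables into a single data frame
do.call("rbind", table_list)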
User response:
We can use purrr::map(), pass the variables in as strings, and use !! rlang::sym(.x) to evaluate them in the freq_table() function, no explicit loop needed.
library(freqtables)
library(purrr)
map_dfr(c("am", "gear"),
        ~ freq_table(mtcars, !! rlang::sym(.x), percent_ci = 95, ci_type = "logit", drop = FALSE))
#> var cat n n_total percent se t_crit lcl ucl
#> 1 am 0 19 32 59.375 8.820997 2.039513 40.942255 75.49765
#> 2 am 1 13 32 40.625 8.820997 2.039513 24.502354 59.05775
#> 3 gear 3 15 32 46.875 8.962708 2.039513 29.750378 64.76868
#> 4 gear 4 12 32 37.500 8.695104 2.039513 21.969117 56.11463
#> 5 gear 5 5 32 15.625 6.521328 2.039513 6.325398 33.68097
We could also use base R eval(bquote()) to evaluate as.name(i) before evaluating the whole call to freq_table():
library(freqtables)
vars <- c('am', 'gear')
out <- vector("list", 0L)
for (i in vars) {
  out[[i]] <- eval(bquote(freq_table(mtcars, .(as.name(i)), percent_ci = 95, ci_type = "logit", drop = FALSE)))
}
dplyr::bind_rows(out)
#> var cat n n_total percent se t_crit lcl ucl
#> 1 am 0 19 32 59.375 8.820997 2.039513 40.942255 75.49765
#> 2 am 1 13 32 40.625 8.820997 2.039513 24.502354 59.05775
#> 3 gear 3 15 32 46.875 8.962708 2.039513 29.750378 64.76868
#> 4 gear 4 12 32 37.500 8.695104 2.039513 21.969117 56.11463
#> 5 gear 5 5 32 15.625 6.521328 2.039513 6.325398 33.68097
Created on 2022-09-30 by the reprex package (v0.3.0)
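To make the bquote() step less opaque, this small sketch builds the call without evaluating it, so you can inspect what eval() would receive. It is for illustration only: freq_table() is never run here, so freqtables does not need to be loaded, and the name cl is arbitrary.

```r
i <- "am"

# bquote() quotes the whole call but substitutes anything wrapped in .( ).
# as.name(i) turns the string "am" into the symbol am, so the resulting call
# refers to the column by bare name, as freq_table() expects.
cl <- bquote(freq_table(mtcars, .(as.name(i)), percent_ci = 95))

print(cl)
#> freq_table(mtcars, am, percent_ci = 95)
```

Calling eval(cl) with freqtables loaded then runs exactly the call you would have typed by hand.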