Using a loop or lapply on the freq_table() function in R

I'm trying to create a series of frequency tables for several categorical variables using the freq_table() function. I've seen plenty of posts that do this with similar table functions, but I can't get those approaches to work with freq_table(), which I'm specifically interested in because it automatically generates confidence intervals.

This is the long version of what I'm trying to accomplish:

library(freqtables)
data(mtcars)
freq_table_am <- freq_table(mtcars, am, percent_ci = 95, ci_type = "logit", drop = FALSE)
freq_table_gear <- freq_table(mtcars, gear, percent_ci = 95, ci_type = "logit", drop = FALSE)
freq_tables <- rbind(freq_table_am, freq_table_gear)
freq_tables

I've tried both lapply and for loops, with no luck so far. Here are my attempts with a for loop:

vars <- c('am', 'gear')

# Attempt 1 with a for loop - produces an error message about "[" being an unexpected symbol
for (i in vars) {
  freq_table(community_survey, [i], percent_ci = 95, ci_type = "logit", drop = FALSE)
}

# Attempt 2 with a for loop - produces an error message about column "i" not being found
for (i in vars) {
  freq_table(community_survey, i, percent_ci = 95, ci_type = "logit", drop = FALSE)
}

Is there a way to do what I'm hoping to do, using either lapply, for loops, or another method? Or is there an alternative function I can use that will work better than freq_table() but still produce the confidence intervals I'm looking for?

Thank you!

CodePudding user response：

This fails because freqtables::freq_table() calls dplyr::count() under the hood, and count() uses tidy evaluation to select the grouping variables (here, am and gear). The loop variable i is therefore taken as a literal column name rather than as the string it holds.

The dplyr documentation describes the fix:

If you have the column name as a character vector, use the .data pronoun, e.g. summarise(df, mean = mean(.data[[var]])).
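As a quick illustration of that quoted advice outside of freq_table(), here is a minimal sketch using dplyr::summarise() directly. The column choice (mpg) and the object names var and res are only for illustration and are not from the original question.

```r
library(dplyr)

# Column name held as a string (illustrative choice)
var <- "mpg"

# .data[[var]] looks up the column whose name is stored in var,
# instead of looking for a column literally called "var"
res <- summarise(mtcars, mean = mean(.data[[var]]))
res
```

The same .data[[...]] indexing is what the lapply solution relies on.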

See below for a solution with base R lapply:

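library(freqtables)
vars <- c("am", "gear")
# Build one frequency table per variable; .data[[x]] selects the column named by the string x
table_list <- lapply(vars, function(x) {
  freq_table(mtcars, .data[[x]], percent_ci = 95, ci_type = "logit", drop = FALSE)
})
# Stack the per-variable tables into a single data frame
do.call("rbind", table_list)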

CodePudding user response：

We can use purrr::map_dfr(), pass the variable names as strings, and use !! rlang::sym(.x) to turn each string into a symbol inside the freq_table() call; no explicit loop is needed.

library(freqtables)
library(purrr)

map_dfr(c("am", "gear"),
        ~ freq_table(mtcars, !! rlang::sym(.x), percent_ci = 95, ci_type = "logit", drop = FALSE))
#>    var cat  n n_total percent       se   t_crit       lcl      ucl
#> 1   am   0 19      32  59.375 8.820997 2.039513 40.942255 75.49765
#> 2   am   1 13      32  40.625 8.820997 2.039513 24.502354 59.05775
#> 3 gear   3 15      32  46.875 8.962708 2.039513 29.750378 64.76868
#> 4 gear   4 12      32  37.500 8.695104 2.039513 21.969117 56.11463
#> 5 gear   5  5      32  15.625 6.521328 2.039513  6.325398 33.68097
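To see what !! rlang::sym() is doing, here is a small sketch that uses dplyr::count() as a stand-in, since that is the function freq_table() is said to call internally. The names v, s, by_string and by_name are just illustrative.

```r
library(dplyr)

v <- "am"            # column name stored as a string
s <- rlang::sym(v)   # convert the string to the symbol `am`

# !! injects the symbol into the call before count() evaluates its arguments,
# so count() sees `am` exactly as if it had been typed unquoted
by_string <- count(mtcars, !!s)
by_name   <- count(mtcars, am)

identical(by_string, by_name)
```

If the two results compare as identical, the string-driven call behaves the same as writing the column name by hand, which is why it can be used inside map_dfr().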

We could also use base R's eval(bquote()), which substitutes as.name(i) into the call before the whole call to freq_table() is evaluated:

library(freqtables)

vars <- c('am', 'gear')

out <- vector("list", 0L)

for (i in vars) {
  out[[i]] <- eval(bquote(freq_table(mtcars, .(as.name(i)), percent_ci = 95, ci_type = "logit", drop = FALSE)))
}

dplyr::bind_rows(out)
#>    var cat  n n_total percent       se   t_crit       lcl      ucl
#> 1   am   0 19      32  59.375 8.820997 2.039513 40.942255 75.49765
#> 2   am   1 13      32  40.625 8.820997 2.039513 24.502354 59.05775
#> 3 gear   3 15      32  46.875 8.962708 2.039513 29.750378 64.76868
#> 4 gear   4 12      32  37.500 8.695104 2.039513 21.969117 56.11463
#> 5 gear   5  5      32  15.625 6.521328 2.039513  6.325398 33.68097

Created on 2022-09-30 by the reprex package (v0.3.0)
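For anyone unsure what bquote() builds before eval() runs it, here is a base-R-only sketch. It swaps freq_table() for with() and table() purely so that it runs without extra packages; the names cl and res are illustrative.

```r
i <- "gear"

# bquote() leaves the expression unevaluated, except for anything wrapped in .(),
# which is evaluated first and spliced in; as.name(i) becomes the symbol `gear`
cl <- bquote(with(mtcars, table(.(as.name(i)))))
print(cl)
#> with(mtcars, table(gear))

# Evaluating the constructed call is the same as typing it by hand
res <- eval(cl)
res
```

The loop in the answer does the same thing, only with freq_table() as the call being constructed.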