Loop to create crosstabs of columns using tidyr

Time:09-28

I would like to use a loop to create crosstabs of one column with every other column in a df. I started with this code (substituting in the iris df), which works nicely for two variables:

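library(dplyr)    # provides the %>% pipe
library(janitor)  # provides tabyl() and the adorn_*() helpers

# Crosstab of Species by Sepal.Length: row percentages with counts
tbl <- iris %>% 
  tabyl(Species, Sepal.Length, show_missing_levels = FALSE, show_na = FALSE) %>%
  adorn_percentages("row") %>%
  adorn_pct_formatting(digits = 0) %>%
  adorn_ns() %>%
  adorn_title("combined") %>%
  knitr::kable()
print(tbl)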

My df contains ~200 columns. I thought I would write a for loop to print a crosstab for one variable with each of the other variables. Here's what I tried:

cols <- c('Sepal.Length', 'Sepal.Width')
for (c in cols){
  tbl <- iris %>% 
    tabyl(Species, c, show_missing_levels = FALSE, show_na = FALSE) %>%
    adorn_percentages("row") %>%
    adorn_pct_formatting(digits = 0) %>%
    adorn_ns() %>%
    adorn_title("combined") %>%
    knitr::kable()
  print(tbl)
}

This returns the error: Column `c` is not found.

This seems like it should be simple, but I can't figure it out. Thanks for any help.

CodePudding user response:

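Change the c in your code to !!sym(c). tabyl() uses tidy (non-standard) evaluation, so it treats the bare c as a column name and looks for a column called c in iris instead of reading the string stored in your loop variable. sym(c) turns that string into a symbol and !! injects it into the call, so tabyl() sees Sepal.Length rather than c.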
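
For reference, here is a minimal sketch of the loop from the question with that change applied. It assumes the dplyr, janitor, rlang and knitr packages are installed; the loop variable is renamed to col (my choice, not part of the question) to avoid confusion with base R's c().

```r
library(dplyr)    # provides the %>% pipe
library(janitor)  # provides tabyl() and the adorn_*() helpers

cols <- c("Sepal.Length", "Sepal.Width")

for (col in cols) {
  tbl <- iris %>%
    # rlang::sym() converts the string in `col` to a symbol; !! injects it
    tabyl(Species, !!rlang::sym(col),
          show_missing_levels = FALSE, show_na = FALSE) %>%
    adorn_percentages("row") %>%
    adorn_pct_formatting(digits = 0) %>%
    adorn_ns() %>%
    adorn_title("combined") %>%
    knitr::kable()
  print(tbl)
}
```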

CodePudding user response:

You can use the .data pronoun when passing column names as strings.

cols <- c('Sepal.Length', 'Sepal.Width')

for (col in cols){
  tbl <- iris %>% 
    tabyl(Species, .data[[col]], show_missing_levels = FALSE, show_na = FALSE) %>%
    adorn_percentages("row") %>%
    adorn_pct_formatting(digits = 0) %>%
    adorn_ns() %>%
    adorn_title("combined") %>%
    knitr::kable()
  print(tbl)
}
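
Since the question mentions roughly 200 columns, the column list can probably be built from names() rather than typed by hand. The following is only a sketch under a few assumptions: crosstab_one() is a hypothetical helper name, not a janitor function, and the column is passed with !!rlang::sym(); the .data[[col]] form from the answer above could be used in the tabyl() call instead.

```r
library(dplyr)    # provides the %>% pipe
library(janitor)  # provides tabyl() and the adorn_*() helpers

# Hypothetical helper: crosstab of `row_var` against `col_var`,
# both given as strings
crosstab_one <- function(df, row_var, col_var) {
  df %>%
    tabyl(!!rlang::sym(row_var), !!rlang::sym(col_var),
          show_missing_levels = FALSE, show_na = FALSE) %>%
    adorn_percentages("row") %>%
    adorn_pct_formatting(digits = 0) %>%
    adorn_ns() %>%
    adorn_title("combined")
}

# Every column except the one used for the rows
other_cols <- setdiff(names(iris), "Species")

# One crosstab per column, kept in a named list
tables <- lapply(other_cols, function(col) crosstab_one(iris, "Species", col))
names(tables) <- other_cols

for (col in other_cols) {
  print(knitr::kable(tables[[col]]))
}
```

Keeping the tables in a named list means a single one can be pulled out later, e.g. tables[["Petal.Width"]], without rerunning the loop.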