Running Levene's test for each column of a df in R

Time:06-11

I have a data frame containing scores for several sub-scales of the same test (columns: participant, session, group, total score, and one column per sub-scale). I am trying to run assumption checks for a two-way mixed ANOVA for each sub-scale. For convenience, I would like to write one loop per assumption check that prints the output for all sub-scales. This worked well for checking outliers, running Box's M test, and generating the actual ANOVA output. However, I get an error when I try the same thing with Levene's test. See the code and error below:

subscales <- c("awareness", "clarity", "impulse", "goals", "nonacceptance", 
               "strategies") # these correspond to the column names in the df
for (scale in subscales) {
  ders %>%
  group_by(session) %>%
  levene_test(scale ~ group) %>%
  kable(caption = scale) %>% print()
}

Error in mutate(., data = map(.data$data, .f, ...)) : Caused by error in model.frame.default(): ! variable lengths differ (found for 'group')

How can I run Levene's test for all columns in my df without repeating the same code over and over? I'm new to R, so maybe I'm approaching this in too Pythonic a way and should use something like lapply() instead?

Answer:

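Inside the loop, scale holds a quoted string, so scale ~ group does not refer to the column that the string names. model.frame() cannot find a column called scale in the data and falls back to the loop variable itself, a length-one character vector, which is why it reports that variable lengths differ. Build the formula from the string instead, either with reformulate() or with paste() and as.formula():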

for (scale in subscales) {
  ders %>%
  group_by(session) %>%
  levene_test(reformulate('group', response = scale)) %>%
  kable(caption = scale) %>% print()
}
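
To see what the loop above passes to levene_test(), here is a minimal base-R sketch of how reformulate() turns each string into a formula. It uses only three of the sub-scale names from the question and does not need the ders data frame; the names formulas and by_hand are introduced here purely for illustration.

```r
# Sub-scale names from the question, held as character strings.
subscales <- c("awareness", "clarity", "impulse")

# reformulate(termlabels, response) builds a formula object from strings.
formulas <- lapply(subscales, function(s) reformulate("group", response = s))

# Print one formula per sub-scale.
for (f in formulas) print(f)

# The same formula spelled out by hand with paste() and as.formula().
by_hand <- as.formula(paste(subscales[1], "~ group"))
print(by_hand)
```

Either form gives model.frame() a formula whose left-hand side is the column name itself rather than the loop variable.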

This may also be done with across():

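library(dplyr)
library(stringr)
library(tidyr)
library(rstatix)
data(mtcars)
mtcars %>%
   mutate(carb = factor(carb)) %>%   # grouping variable as a factor
   group_by(cyl) %>%
   summarise(across(c(mpg, disp),
    # one Levene's test per response column, within each cyl group
    ~ levene_test(cur_data(),
        reformulate('carb', response = cur_column())) %>%
           # prefix the result columns with the response name
           rename_with(~ str_c(cur_column(), .x), everything()) )) %>%
   # flatten the data-frame columns; is.data.frame needs no extra package
   unpack(where(is.data.frame))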

Output:

# A tibble: 3 × 9
    cyl mpgdf1 mpgdf2 mpgstatistic  mpgp dispdf1 dispdf2 dispstatistic    dispp
  <dbl>  <int>  <int>        <dbl> <dbl>   <int>   <int>         <dbl>    <dbl>
1     4      1      9        0.975 0.349       1       9      1.32e- 1 7.24e- 1
2     6      2      4        2.52  0.196       2       4      7.44e 29 7.23e-60
3     8      3     10        1.60  0.251       3      10      1.18e  1 1.27e- 3
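
The question also asks whether lapply() would be a better fit than a for loop. A minimal sketch of that variant follows, assuming the same mtcars stand-in as above: cyl plays the role of session, carb the role of group, and mpg and disp the role of the sub-scale columns. The names dat, responses, results and all_tests are illustrative only.

```r
library(dplyr)
library(rstatix)

# Stand-in data: mtcars in place of `ders`, with the grouping
# variable converted to a factor.
dat <- mtcars %>% mutate(carb = factor(carb))
responses <- c("mpg", "disp")   # stand-ins for the sub-scale columns

# One grouped Levene's test per response column, returned as a named list.
results <- lapply(setNames(responses, responses), function(resp) {
  dat %>%
    group_by(cyl) %>%
    levene_test(reformulate("carb", response = resp))
})

# Stack the per-column results; `.id` records which column each row came from.
all_tests <- bind_rows(results, .id = "response")
print(all_tests)
```

This keeps the results in long format, one row per response and group, which may be easier to filter or pass to kable() than the wide table produced by across().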