Using dplyr in function: Error when passing variable name

Time:08-19

I am writing a function that runs a t-test on several temperature variables whose names start with temp_.

I want the function to take the variable name (var_name) as its argument. When I pass the data frame as the argument instead, the function below runs successfully:

balance_table <- function(df) {
  table_result <- df %>%
    rstatix::t_test(temp_ca ~ station) %>%
    rstatix::adjust_pvalue(method = "BH") %>%
    rstatix::add_significance()
  table_result
}
balance_table(df_weather)

However, when I pass a variable name instead of the data frame, as below, I get the following error.

balance_table <- function(var_name) {
  table_result <- df_weather %>%
    # var_name <- enquo(var_name)
    rstatix::t_test(var_name ~ station) %>%
    # rstatix::t_test(!!var_name ~ station) %>%
    rstatix::adjust_pvalue(method = "BH") %>%
    rstatix::add_significance()
  table_result
}
balance_table(temp_ca)

Error in `vec_as_location2_result()`:
! Can't extract columns that don't exist.
✖ Column `as.name(comp_var)` doesn't exist.
Run `rlang::last_error()` to see where the error occurred.
Called from: signal_abort(cnd)

Finally, the same call works fine when it is not wrapped in a function.

df_weather %>%
  rstatix::t_test(temp_ca ~ station) %>%
  rstatix::adjust_pvalue(method = "BH") %>%
  rstatix::add_significance()
# A tibble: 1 × 10
  .y.         group1 group2    n1    n2 statistic    df     p p.adj
  <chr>       <chr>  <chr>  <int> <int>     <dbl> <dbl> <dbl> <dbl>
1 temp_ca     T      C        124   124     0.648  237. 0.518 0.518
# … with 1 more variable: p.adj.signif <chr>

I tried several approaches, such as those in the questions listed below, but still got errors. I would appreciate any suggestions, or alternative or more efficient code, for running the t-test (ultimately, I want to build a balance table).

Error when using dplyr inside of a function

Using Variable names for dplyr inside function

CodePudding user response:

Since you did not provide df_weather, I am using mtcars as an example. The variable name is passed as a string, and reformulate() builds the formula from it.

library(rstatix)
library(dplyr)

balance_table <- function(var_name) {
  mtcars %>%
    t_test(reformulate("cyl", var_name)) %>%
    adjust_pvalue(method = "BH") %>%
    add_significance()
}

balance_table("mpg")

# A tibble: 3 × 10
#  .y.   group1 group2    n1    n2 statistic    df          p      p.adj p.adj.signif
#  <chr> <chr>  <chr>  <int> <int>     <dbl> <dbl>      <dbl>      <dbl> <chr>       
#1 mpg   4      6         11     7      4.72  13.0 0.000405   0.000405   ***         
#2 mpg   4      8         11    14      7.60  15.0 0.00000164 0.00000492 ****        
#3 mpg   6      8          7    14      5.29  18.5 0.0000454  0.0000681  ****        

The result is the same as when you run the call directly:

mtcars %>%
  t_test(mpg ~ cyl) %>%
  adjust_pvalue(method = "BH") %>%
  add_significance()

#  .y.   group1 group2    n1    n2 statistic    df          p      p.adj p.adj.signif
#  <chr> <chr>  <chr>  <int> <int>     <dbl> <dbl>      <dbl>      <dbl> <chr>       
#1 mpg   4      6         11     7      4.72  13.0 0.000405   0.000405   ***         
#2 mpg   4      8         11    14      7.60  15.0 0.00000164 0.00000492 ****        
#3 mpg   6      8          7    14      5.29  18.5 0.0000454  0.0000681  ****        
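
For readers unfamiliar with it, reformulate() is a base R (stats) function that builds a formula from character strings, which is why the column is passed as "mpg" rather than mpg. Since the question mentions several temp_ variables, the sketch below also shows one possible way to loop over multiple outcome columns. The helper name balance_table_for and its arguments are hypothetical (not part of rstatix), mtcars again stands in for df_weather, and the rstatix part is skipped if the package is not installed.

```r
# reformulate(termlabels, response) builds `response ~ termlabels` from strings.
fml <- reformulate("cyl", response = "mpg")
print(fml)  # mpg ~ cyl

# Hypothetical helper: same pipeline as balance_table(), but the data frame,
# the outcome column, and the grouping column are all arguments.
balance_table_for <- function(df, var_name, group) {
  res <- rstatix::t_test(df, reformulate(group, response = var_name))
  res <- rstatix::adjust_pvalue(res, method = "BH")
  rstatix::add_significance(res)
}

# Run the test for several outcome columns and stack the results.
if (requireNamespace("rstatix", quietly = TRUE)) {
  vars <- c("mpg", "disp", "hp")
  tables <- lapply(vars, function(v) balance_table_for(mtcars, v, "cyl"))
  balance <- do.call(rbind, tables)
  print(balance)
}
```

In this sketch the BH adjustment is applied within each outcome's table separately. Adjusting across all outcomes at once would presumably mean calling adjust_pvalue() on the combined table instead.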

CodePudding user response:

Try wrapping var_name in double curly braces ({{ }}).

balance_table <- function(var_name) {
  table_result <- df_weather %>%
    rstatix::t_test({{var_name}} ~ station) %>%
    rstatix::adjust_pvalue(method = "BH") %>%
    rstatix::add_significance()
  table_result
}
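
Note that {{ }} is a tidy-evaluation operator intended for data-masking arguments such as those of dplyr verbs. A formula is not such an argument, so whether {{var_name}} ~ station is interpolated depends on how t_test() processes its formula, and it may fail. If it does, a minimal sketch that still accepts a bare column name is to capture the name with base R's substitute() and build the formula with reformulate(). The helper name outcome_formula and the two-argument balance_table(df, fml) signature are assumptions made for illustration, and mtcars stands in for df_weather.

```r
# Hypothetical helper: turn a bare column name into `outcome ~ group`.
outcome_formula <- function(var_name, group = "station") {
  outcome <- deparse(substitute(var_name))  # temp_ca -> "temp_ca"
  reformulate(group, response = outcome)
}

# Same rstatix pipeline as above, but taking a ready-made formula.
balance_table <- function(df, fml) {
  res <- rstatix::t_test(df, fml)
  res <- rstatix::adjust_pvalue(res, method = "BH")
  rstatix::add_significance(res)
}

print(deparse(outcome_formula(temp_ca)))  # "temp_ca ~ station"

# With the asker's data this would be:
#   balance_table(df_weather, outcome_formula(temp_ca))
# Demonstrated on mtcars, and skipped if rstatix is not installed:
if (requireNamespace("rstatix", quietly = TRUE)) {
  print(balance_table(mtcars, outcome_formula(mpg, group = "cyl")))
}
```

Because substitute() captures the expression exactly as typed by the caller, outcome_formula() has to be called directly with the column name. Forwarding the name through another wrapper function would capture the wrapper's argument name instead.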