Error in dplyr::select() while writing a function

Time:11-23

I am writing a function to run a one-way ANOVA and have written the function below:

fun_aov <- function(sps, param){
  
  ths_data_aov <-  ths_data_aov |> filter(Species == sps) |> 
    select(Species, Treatment, param)
  
  n_one_way_aov <- aov( param ~ Treatment, data = ths_data_aov)
  h <- summary(n_one_way_aov)
  
  return(h)
}

It is meant to work on the following data frame:

Species Treatment    num_roots_n lng_long_root_cm dia_long_root_mm
   <chr>   <chr>           <dbl>          <dbl>        <dbl>
  x1       t1                  4            7          0.6
  x1       t1                  4            7          0.6
  x1       t1                  4            7          0.6
  x1       t1                  4            7          0.6
  x1       t2                  4            8          0.7
  x1       t2                  3            6          0.8
  x1       t2                  4            8          0.9
  x1       t2                  5            7          0.3
  x1       t3                  8            8          0.5
  x1       t3                  3            5          0.7
  x1       t3                  4            6          0.3
  x1       t3                  5            5          0.7
  x1       t4                  6            4          0.7
  x1       t4                  9            3          0.8
  x1       t4                  9            3          0.8
  x1       t4                  9            3          0.8

But when I execute fun_aov("x1", lng_long_root_cm), an error shows up:

Error in `select()`:
! object 'lng_long_root_cm' not found

How do I fix it?

I expect the function to return h, the ANOVA summary for the given Species and param.

CodePudding user response:

The difficulty here is that your function uses two different functions which themselves rely on two different types of non-standard evaluation. If you use a dplyr verb such as select inside a function, and wish to pass an unquoted column name as an argument (such as param in your example), then you need to use the curly-curly operator, i.e. {{param}}, inside select. Without it, select finds no column called param, falls back to the function argument param, and tries to evaluate lng_long_root_cm as an ordinary object outside the data frame, which is why you see "object 'lng_long_root_cm' not found".

Similarly, variable names inside a formula are not substituted for parameters passed into your function, so aov will be looking for a column called param inside your data frame. Here, you need to construct the formula with something like as.formula and pass it with do.call to get this working correctly:

library(dplyr)

fun_aov <- function(sps, param) {
  
  ths_data_aov <-  ths_data_aov %>% 
    filter(Species == sps) %>% 
    select(Species, Treatment, {{param}})
  
  aov_f <- as.formula(paste(deparse(substitute(param)), '~ Treatment'))
  n_one_way_aov <- do.call('aov', list(formula = aov_f, data  = ths_data_aov))
  
  summary(n_one_way_aov)
}

Now you have

fun_aov("x1", lng_long_root_cm)
#>             Df Sum Sq Mean Sq F value  Pr(>F)    
#> Treatment    3  40.25  13.417   16.95 0.00013 ***
#> Residuals   12   9.50   0.792                    
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
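
To illustrate the point about formulas, here is a small base R sketch. The helper name make_aov_formula is hypothetical and only for illustration, and it uses reformulate() as one alternative to as.formula(paste(...)). It shows that a literal formula keeps the symbol param, while a formula built from a string contains the real column name:

```r
# Hypothetical helper: build "<response> ~ Treatment" from a column name string.
make_aov_formula <- function(response) {
  reformulate("Treatment", response = response)
}

param <- "lng_long_root_cm"

# A literal formula keeps the symbol `param`; its value is not substituted.
literal_f <- param ~ Treatment
all.vars(literal_f)

# Building the formula from the string inserts the actual column name.
built_f <- make_aov_formula(param)
all.vars(built_f)
```

all.vars() lists the variable names a formula refers to, so comparing the two results makes the difference visible without fitting a model.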

Data in reproducible format

ths_data_aov <- structure(list(Species = c("x1", "x1", "x1", "x1", "x1", "x1", 
"x1", "x1", "x1", "x1", "x1", "x1", "x1", "x1", "x1", "x1"), 
    Treatment = c("t1", "t1", "t1", "t1", "t2", "t2", "t2", "t2", 
    "t3", "t3", "t3", "t3", "t4", "t4", "t4", "t4"), num_roots_n = c(4L, 
    4L, 4L, 4L, 4L, 3L, 4L, 5L, 8L, 3L, 4L, 5L, 6L, 9L, 9L, 9L
    ), lng_long_root_cm = c(7L, 7L, 7L, 7L, 8L, 6L, 8L, 7L, 8L, 
    5L, 6L, 5L, 4L, 3L, 3L, 3L), dia_long_root_mm = c(0.6, 0.6, 
    0.6, 0.6, 0.7, 0.8, 0.9, 0.3, 0.5, 0.7, 0.3, 0.7, 0.7, 0.8, 
    0.8, 0.8)), class = "data.frame", row.names = c(NA, -16L))

CodePudding user response:

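When the column name is passed to the function as a variable, give it as a quoted string in the function call. Inside select(), wrap it in all_of(param) so that the string is treated as a column name. The formula has to be built from the string as well (here with reformulate()), because aov(param ~ Treatment) would look for a variable literally called param:

library(dplyr)

fun_aov <- function(sps, param){
  
  ths_data_aov <-  ths_data_aov |> filter(Species == sps) |> 
    select(Species, Treatment, all_of(param))
  
  # build the formula <param> ~ Treatment from the column name string
  aov_f <- reformulate("Treatment", response = param)
  n_one_way_aov <- aov(aov_f, data = ths_data_aov)
  h <- summary(n_one_way_aov)
  
  return(h)
}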

fun_aov("x1", 'lng_long_root_cm')
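
Passing the column name as a string also makes it easy to loop over several response columns. The following is a minimal sketch, not part of the original answer: fun_aov_chr is a hypothetical variant that takes the data frame as an explicit argument and builds the formula with reformulate(), and the data frame is rebuilt from two columns of the question's data.

```r
library(dplyr)

# Data from the question (only the columns used in this sketch).
ths_data_aov <- data.frame(
  Species          = rep("x1", 16),
  Treatment        = rep(c("t1", "t2", "t3", "t4"), each = 4),
  num_roots_n      = c(4, 4, 4, 4, 4, 3, 4, 5, 8, 3, 4, 5, 6, 9, 9, 9),
  lng_long_root_cm = c(7, 7, 7, 7, 8, 6, 8, 7, 8, 5, 6, 5, 4, 3, 3, 3)
)

# Hypothetical variant of fun_aov: `data` is an explicit argument and
# `param` is a column name given as a string.
fun_aov_chr <- function(data, sps, param) {
  sub <- data |>
    filter(Species == sps) |>
    select(Species, Treatment, all_of(param))

  # Build "<param> ~ Treatment" from the string.
  f <- reformulate("Treatment", response = param)
  summary(aov(f, data = sub))
}

# Run the same one-way ANOVA for each response column.
params <- c("num_roots_n", "lng_long_root_cm")
results <- lapply(params, function(p) fun_aov_chr(ths_data_aov, "x1", p))
names(results) <- params

results[["lng_long_root_cm"]]
```

Taking the data frame as an argument avoids relying on a global object named ths_data_aov, which makes the function easier to reuse and test.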