I am writing a function to run a one-way ANOVA and have come up with the function below:
fun_aov <- function(sps, param){
ths_data_aov <- ths_data_aov |> filter(Species == sps) |>
select(Species, Treatment, param)
n_one_way_aov <- aov( param ~ Treatment, data = ths_data_aov)
h <- summary(n_one_way_aov)
return(h)
}
for the data frame below
Species Treatment num_roots_n lng_long_root_cm dia_long_root_mm
<chr> <chr> <dbl> <dbl> <dbl>
x1 t1 4 7 0.6
x1 t1 4 7 0.6
x1 t1 4 7 0.6
x1 t1 4 7 0.6
x1 t2 4 8 0.7
x1 t2 3 6 0.8
x1 t2 4 8 0.9
x1 t2 5 7 0.3
x1 t3 8 8 0.5
x1 t3 3 5 0.7
x1 t3 4 6 0.3
x1 t3 5 5 0.7
x1 t4 6 4 0.7
x1 t4 9 3 0.8
x1 t4 9 3 0.8
x1 t4 9 3 0.8
but when I execute the function
fun_aov("x1", lng_long_root_cm)
an error shows up saying
Error in `select()`: ! object 'lng_long_root_cm' not found
How do I rectify it? I am expecting the function to return h, which gives me the analysis for a particular Species and param.
CodePudding user response:
The difficulty here is that your function calls two functions that each rely on a different kind of non-standard evaluation. If you use a dplyr verb such as select inside a function and want to pass an unquoted column name as an argument (such as param in your example), you need the curly-curly operator, i.e. {{param}}, inside select. Otherwise select looks for a column literally called param, which doesn't exist in your data frame, and then tries to evaluate param as an ordinary variable, which is what produces the "object not found" error.
Similarly, variable names inside a formula are not replaced by the arguments passed to your function, so aov will look for a column called param in your data frame. Here you need to build the formula yourself, with something like as.formula, and pass it via do.call to get this working correctly:
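library(dplyr)

fun_aov <- function(sps, param) {
  # {{param}} forwards the unquoted column name to select()
  ths_data_aov <- ths_data_aov %>%
    filter(Species == sps) %>%
    select(Species, Treatment, {{param}})
  # Build the formula from the column name, e.g. lng_long_root_cm ~ Treatment
  aov_f <- as.formula(paste(deparse(substitute(param)), '~ Treatment'))
  # do.call hands the already-evaluated formula and data to aov
  n_one_way_aov <- do.call('aov', list(formula = aov_f, data = ths_data_aov))
  summary(n_one_way_aov)
}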
Now the call works:
fun_aov("x1", lng_long_root_cm)
#>             Df Sum Sq Mean Sq F value  Pr(>F)
#> Treatment    3  40.25  13.417   16.95 0.00013 ***
#> Residuals   12   9.50   0.792
#> ---
#> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
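To see the two mechanisms in isolation, here is a minimal sketch on a toy data frame. The names toy, pick_col, col_name and make_formula are hypothetical and only serve the illustration; they are not part of the answer's function.

```r
library(dplyr)

# Hypothetical toy data, not the asker's data set
toy <- data.frame(g = c("a", "a", "b", "b"), y = c(1, 2, 3, 5))

# {{ col }} forwards the unquoted column name to select()
pick_col <- function(df, col) {
  df %>% select({{ col }})
}

# deparse(substitute(col)) recovers the argument's name as a string
col_name <- function(col) {
  deparse(substitute(col))
}

# as.formula() turns the pasted string into a formula object
make_formula <- function(col) {
  as.formula(paste(deparse(substitute(col)), "~ g"))
}

names(pick_col(toy, y))   # the selected column
col_name(y)               # "y"
make_formula(y)           # a formula with y on the left and g on the right
```

The same two steps, forwarding with {{ }} for dplyr and rebuilding the formula from the name for aov, are what the function above combines.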
Data in reproducible format
ths_data_aov <- structure(list(Species = c("x1", "x1", "x1", "x1", "x1", "x1",
"x1", "x1", "x1", "x1", "x1", "x1", "x1", "x1", "x1", "x1"),
Treatment = c("t1", "t1", "t1", "t1", "t2", "t2", "t2", "t2",
"t3", "t3", "t3", "t3", "t4", "t4", "t4", "t4"), num_roots_n = c(4L,
4L, 4L, 4L, 4L, 3L, 4L, 5L, 8L, 3L, 4L, 5L, 6L, 9L, 9L, 9L
), lng_long_root_cm = c(7L, 7L, 7L, 7L, 8L, 6L, 8L, 7L, 8L,
5L, 6L, 5L, 4L, 3L, 3L, 3L), dia_long_root_mm = c(0.6, 0.6,
0.6, 0.6, 0.7, 0.8, 0.9, 0.3, 0.5, 0.7, 0.3, 0.7, 0.7, 0.8,
0.8, 0.8)), class = "data.frame", row.names = c(NA, -16L))
CodePudding user response:
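When you pass a column name to a function as a value, quote it in the function call. Inside the function, wrap it in all_of(param) within select so that the string is treated as a column name. The formula also has to be built from the string, because writing param ~ Treatment makes aov look for a variable literally named param rather than the column it names:
fun_aov <- function(sps, param){
  ths_data_aov <- ths_data_aov |> filter(Species == sps) |>
    select(Species, Treatment, all_of(param))
  # Build e.g. lng_long_root_cm ~ Treatment from the string in param
  aov_f <- as.formula(paste(param, "~ Treatment"))
  n_one_way_aov <- aov(aov_f, data = ths_data_aov)
  h <- summary(n_one_way_aov)
  return(h)
}
fun_aov("x1", 'lng_long_root_cm')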
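Passing the column name as a string also makes it easy to run the same analysis over several response columns. The following is a sketch under some assumptions: fun_aov_chr is a hypothetical base-R helper (it takes the data frame as an argument and uses no dplyr), and the data is retyped from the question.

```r
# Data retyped from the question (species x1 only)
ths_data_aov <- data.frame(
  Species = rep("x1", 16),
  Treatment = rep(c("t1", "t2", "t3", "t4"), each = 4),
  num_roots_n = c(4, 4, 4, 4, 4, 3, 4, 5, 8, 3, 4, 5, 6, 9, 9, 9),
  lng_long_root_cm = c(7, 7, 7, 7, 8, 6, 8, 7, 8, 5, 6, 5, 4, 3, 3, 3),
  dia_long_root_mm = c(0.6, 0.6, 0.6, 0.6, 0.7, 0.8, 0.9, 0.3,
                       0.5, 0.7, 0.3, 0.7, 0.7, 0.8, 0.8, 0.8)
)

# Hypothetical helper: the response column is given as a string
fun_aov_chr <- function(data, sps, param) {
  # Keep only the rows for the requested species
  sub <- data[data$Species == sps, ]
  # Build e.g. lng_long_root_cm ~ Treatment from the string
  aov_f <- as.formula(paste(param, "~ Treatment"))
  summary(aov(aov_f, data = sub))
}

# Run the one-way ANOVA once per response column
params <- c("num_roots_n", "lng_long_root_cm", "dia_long_root_mm")
results <- lapply(params, function(p) fun_aov_chr(ths_data_aov, "x1", p))
names(results) <- params

results$lng_long_root_cm
```

Each element of results is the summary for one response column, so results$lng_long_root_cm should match the table shown in the first answer.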