Receiving an error when using aov() inside a function

Time:07-21

I am trying to use the aov() function inside a function but R keeps giving me the same error.

Code:

dat$X1 = rep(c("a", "b"), 2)
dat$X2 = c(1,2,3,4)

f = function (x){
  aov(x ~ X1 , data = dat)
}
f('X2')

This gives me the following error:

Error in model.frame.default(formula = x ~ X1, data = dat, drop.unused.levels = TRUE) : 
variable lengths differ (found for 'X1')

The aov() works when I try to replace 'x' with the actual name of the variable (X2) so it doesn't make sense that the variable lengths would differ.

I have looked for this error everywhere but so far I haven't had luck finding the same error anywhere else.

I'm pretty sure that I am overlooking something very obvious but I've been stuck with this for a while.

Looking forward to reading your advice. Thanks.

CodePudding user response:

If you want to call aov() inside a function and pass the variable name as a string, you can build the formula from that string:

dat <- data.frame(X1 = rep(c("a", "b"), 2), X2 = c(1,2,3,4))
f = function (x){
  ff <- as.formula(paste0(x, " ~ X1"))
  aov(ff , data = dat)
}
f('X2')

Call:
   aov(formula = ff, data = dat)

Terms:
                X1 Residuals
Sum of Squares   1         4
Deg. of Freedom  1         2

Residual standard error: 1.414214
Estimated effects may be unbalanced
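
For context on why the original call fails, here is a small sketch (an illustration, not part of the answer's code). Inside the function, `x ~ X1` refers to the argument `x` itself, which is the length-1 string "X2", while `X1` is found in `dat` with four values, hence "variable lengths differ". The helper names `f_broken` and `f_fixed` are made up for this example.

```r
dat <- data.frame(X1 = rep(c("a", "b"), 2), X2 = c(1, 2, 3, 4))

# Original pattern: the formula x ~ X1 uses the argument x as the response.
# Here x is the string "X2" (length 1), not the column dat$X2 (length 4).
f_broken <- function(x) {
  aov(x ~ X1, data = dat)
}

# Capture the error message instead of stopping the script.
msg <- tryCatch(
  {
    f_broken("X2")
    NA_character_
  },
  error = function(e) conditionMessage(e)
)
print(msg)

# Building the formula from the string makes it refer to the column X2.
f_fixed <- function(x) {
  aov(as.formula(paste0(x, " ~ X1")), data = dat)
}
fit <- f_fixed("X2")
print(formula(fit))
```

The key point is that a formula stores symbols, not values, so the name has to be spliced in before `aov()` sees it.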

CodePudding user response:

I advise against defining the function the way you do. It has two key flaws: (1) It depends on a global variable (never good). (2) It doesn't check whether the variables in the formula (one hard-coded, the other a user input, which is awkward in itself) exist in your (global) data.frame.

Here is a better approach:

better_f <- function(data, dep_var, indep_var) {
    stopifnot(all(c(dep_var, indep_var) %in% names(data)))
    aov(reformulate(indep_var, response = dep_var), data = data)
}

# Sample data
dat <- data.frame(X1 = rep(c("a", "b"), 2), X2 = 1:4)

# Test function
better_f(dat, "X2", "X1")
#Call:
#    aov(formula = reformulate(indep_var, response = dep_var), data = data)
#
#Terms:
#                X1 Residuals
#Sum of Squares   1         4
#Deg. of Freedom  1         2
#
#Residual standard error: 1.414214
#Estimated effects may be unbalanced
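
To see what the two building blocks of `better_f` do, here is a short sketch that repeats `better_f` and `dat` from above so it runs on its own. `reformulate()` turns character names into a formula, and `stopifnot()` halts early when a name is not a column of `data`. The misspelled column name "X3" is only an illustration.

```r
better_f <- function(data, dep_var, indep_var) {
  stopifnot(all(c(dep_var, indep_var) %in% names(data)))
  aov(reformulate(indep_var, response = dep_var), data = data)
}

dat <- data.frame(X1 = rep(c("a", "b"), 2), X2 = 1:4)

# reformulate() builds the formula X2 ~ X1 from strings.
fml <- reformulate("X1", response = "X2")
print(fml)

# A name that is not a column is rejected before aov() is ever called.
res <- tryCatch(better_f(dat, "X3", "X1"), error = function(e) "rejected")
print(res)

# The returned fit can be inspected as usual.
fit <- better_f(dat, "X2", "X1")
print(summary(fit))
```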

CodePudding user response:

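Your function takes a single parameter, so to call it you only need to pass an argument. There is no need for quotation marks: pass the vector X2 itself rather than the string 'X2'. Drop the quotes and it will work. Note that this relies on X2 existing as a vector in the calling environment, because x is then resolved inside the function rather than looked up as a column of dat.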

```
X1 = rep(c("a", "b"), 2)
X2 = c(1,2,3,4)
# data.frame() keeps X2 numeric; cbind() would coerce both columns to character
dat <- data.frame(X1, X2)

f = function (x){
    aov(x ~ X1 , data = dat)
}
f(X2)
#f(x=X2) this works, too.
```
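
If the goal is to pass the column name without quotes and still have it refer to a column of the data frame, one possible variant (my assumption, not something the answer above proposes) is to capture the argument with `substitute()` and turn it into a string. The function name `f_unquoted` and its `data` argument are hypothetical.

```r
dat <- data.frame(X1 = rep(c("a", "b"), 2), X2 = c(1, 2, 3, 4))

f_unquoted <- function(data, x) {
  # Capture the unevaluated argument (e.g. the symbol X2) as the string "X2".
  resp <- deparse(substitute(x))
  # Fail early if that name is not a column of the data frame.
  stopifnot(resp %in% names(data))
  aov(reformulate("X1", response = resp), data = data)
}

fit <- f_unquoted(dat, X2)
print(fit)
```

This avoids depending on a free-standing global vector, since the response is always taken from `data`.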