I am trying to use the aov() function inside a function but R keeps giving me the same error.
Code:
dat <- data.frame(X1 = rep(c("a", "b"), 2), X2 = c(1,2,3,4))
f = function (x){
aov(x ~ X1 , data = dat)
}
f('X2')
This gives me the following error:
Error in model.frame.default(formula = x ~ X1, data = dat, drop.unused.levels = TRUE) :
variable lengths differ (found for 'X1')
aov() works when I replace 'x' with the actual name of the variable (X2), so it doesn't make sense that the variable lengths would differ.
I have looked for this error everywhere but so far I haven't had luck finding the same error anywhere else.
I'm pretty sure that I am overlooking something very obvious but I've been stuck with this for a while.
Looking forward to reading your advice. Thanks.
CodePudding user response:
If you want to call aov() inside a function and pass the variable name as a string, you can build the formula from that string:
dat <- data.frame(X1 = rep(c("a", "b"), 2), X2 = c(1,2,3,4))
f = function (x){
ff <-as.formula(paste0(x, "~ X1"))
aov(ff , data = dat)
}
f('X2')
Call:
aov(formula = ff, data = dat)
Terms:
X1 Residuals
Sum of Squares 1 4
Deg. of Freedom 1 2
Residual standard error: 1.414214
Estimated effects may be unbalanced
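As for why the original call fails, here is a minimal sketch (the names fml and fml2 are only for illustration). Writing x ~ X1 keeps the symbol x in the formula; R never substitutes the string stored in x. Since dat has no column called x, the variable is taken from the function's environment, where it is the single string 'X2', and its length of 1 does not match the 4 rows of X1.

```r
# Sample data as in the answer above
dat <- data.frame(X1 = rep(c("a", "b"), 2), X2 = c(1, 2, 3, 4))

x <- "X2"

# The formula stores the symbol `x`, not the contents of the string
fml <- x ~ X1
all.vars(fml)    # "x"  "X1"

# `x` is a length-1 character vector, while dat$X1 has 4 entries
length(x)        # 1
nrow(dat)        # 4

# Pasting the string into the formula text puts the column name into it
fml2 <- as.formula(paste0(x, " ~ X1"))
all.vars(fml2)   # "X2" "X1"
```

That mismatch between 1 and 4 is what model.frame reports as "variable lengths differ".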
CodePudding user response:
I advise against defining a function the way you do. Your function has two key flaws: (1) It depends on a global variable (never good). (2) It doesn't check whether the variables in your formula (one hard-coded, the other a user input, which is awkward in itself) exist in your (global) data.frame.
Here is a better approach:
better_f <- function(data, dep_var, indep_var) {
stopifnot(all(c(dep_var, indep_var) %in% names(data)))
aov(reformulate(indep_var, response = dep_var), data = data)
}
# Sample data
dat <- data.frame(X1 = rep(c("a", "b"), 2), X2 = 1:4)
# Test function
better_f(dat, "X2", "X1")
#Call:
# aov(formula = reformulate(indep_var, response = dep_var), data = data)
#
#Terms:
# X1 Residuals
#Sum of Squares 1 4
#Deg. of Freedom 1 2
#
#Residual standard error: 1.414214
#Estimated effects may be unbalanced
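To illustrate the two pieces of better_f, here is a small sketch that repeats the definition so it runs on its own. The misspelled column "X3" and the names fml and res are made up for the example. reformulate() builds the formula from character strings, and stopifnot() stops the call before aov() ever sees a column that is not in the data.

```r
better_f <- function(data, dep_var, indep_var) {
  # Fail early if any requested column is missing from the data
  stopifnot(all(c(dep_var, indep_var) %in% names(data)))
  aov(reformulate(indep_var, response = dep_var), data = data)
}

dat <- data.frame(X1 = rep(c("a", "b"), 2), X2 = 1:4)

# reformulate() turns the strings into the formula X2 ~ X1
fml <- reformulate("X1", response = "X2")
all.vars(fml)   # "X2" "X1"

# "X3" is not a column of dat, so the check fails before aov() is called
res <- tryCatch(
  better_f(dat, "X3", "X1"),
  error = function(e) "column check failed"
)
res             # "column check failed"
```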
CodePudding user response:
Your function takes a single parameter, so to call it you only need to pass one argument. There is no need for quotation marks: drop them and the call runs. Note that this works only because X2 also exists as a standalone vector in the workspace; inside aov(), x is then that vector rather than the column of dat.
'''
X1 = rep(c("a", "b"), 2)
X2 = c(1,2,3,4)
dat <- data.frame(X1, X2)  # data.frame() keeps X2 numeric; cbind() would coerce it to character
f = function (x){
aov(x ~ X1 , data = dat)
}
f(X2)
#f(x=X2) this works, too.
'''
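A caveat, sketched under the assumption that X2 exists only as a column of dat and not as a separate vector (the first line below removes any leftover global X2 to make sure of that; ok and fit are illustrative names). In that case the unquoted call cannot find X2, and you would have to pass the column itself.

```r
# Make sure no standalone X2 vector is left over from earlier code
if (exists("X2", inherits = FALSE)) rm(X2)

dat <- data.frame(X1 = rep(c("a", "b"), 2), X2 = c(1, 2, 3, 4))

f <- function(x) {
  aov(x ~ X1, data = dat)
}

# X2 is only a column of dat here, so the bare name cannot be resolved
ok <- tryCatch({ f(X2); TRUE }, error = function(e) FALSE)
ok   # FALSE

# Passing the column itself works: x is then a numeric vector of length 4
fit <- f(dat$X2)
```

So the unquoted version depends on what happens to be in the workspace, which is worth keeping in mind if the data only lives in the data frame.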