I'd like to fit a non-linear model using nls
, but I'm getting a non-numeric argument to binary operator
error:
df <- data.frame(task=rep(c("x", "y", "z"), each=6), time=rep(c(1:18)), y=c(1, 5, 9, 14, 17, 29, 2, 3, 1, 5, 2, 3, 29, 21, 18, 16, 14, 5))
OLS <- lm(y~ A time*task, data=df)
I'm trying to get a similar result with the nls
function:
NLS <- nls(y ~ A B*time*task, data=df, start=list(A=1,B=1))
Error in B * time * task : non-numeric argument to binary operator
It seems I'm not using nls
the way it should be used, so what is it that am I doing wrong?
CodePudding user response:
tl;dr because you can't multiply a factor variable (task
) by a numeric variable (time
).
If you want to use nls
to fit linear models (for whatever reason) you need to replicate the things that lm()
is doing under the hood. In particular, in R formula language, time*task
means "effect of time, effect of task, and their interaction". If both variables were numeric, then the interaction variable is just the regular product of the two variables, but if at least one is numeric, something more complicated happens. The simplest way to replicate is to use model.matrix()
:
X <- as.data.frame(model.matrix(~time*task, data =df))
names(X)
## "(Intercept)" "time" "tasky" "taskz" "time:tasky" "time:taskz"
X$y <- df$y ## add response variable (slight abuse of notation)
Because task
has three levels, we need a total of five derived variables (plus the intercept) to fit the full model.
NLS <- nls(y ~ A b1*time b2*tasky b3*taskz b4*`time:tasky` b5*`time:taskz`,
data=X, start=list(A=1,b1 = 0, b2 = 0, b3 = 0, b4 = 0, b5 = 0))
coef(NLS)
## A b1 b2 b3 b4 b5
## -5.600000 5.171429 6.638095 86.095238 -5.000000 -9.257143
Compare with OLS:
OLS <- lm(y~ 1 time*task, data=df) ## 'A' doesn't work
coef(OLS)
## (Intercept) time tasky taskz time:tasky time:taskz
## -5.600000 5.171429 6.638095 86.095238 -5.000000 -9.257143
You can also embed a linear submodel in a nonlinear model using the params
argument of nlme::gnls
:
GNLS <- nlme::gnls(y ~ mu, data = df,
params = list(mu ~ 1 time*task),
start = list(mu = rep(0,6)))
Same coefficients.