I am trying to get the input variable names out of the model object returned by the lm() function. I tried to access the 'variables' attribute under lm_obj$terms. However, the returned object is a 'language' type object rather than a regular vector of names. For example:
lm_obj = lm(y ~ x + z + z:x, data=df)
attr(lm_obj$terms, 'variables')
> list(y, x, z)
What is a 'language' type? How can I convert this 'language' type object to a regular vector like c('x', 'z')?
CodePudding user response:
You may get them out of the call,
fit <- lm(mpg ~ hp, mtcars)
head(all.vars(fit$call), -1)
# [1] "mpg" "hp"
or from the names of the model.frame, which is probably better.
names(model.frame(fit))
# [1] "mpg" "hp"
"language" is the storage mode (storage.mode), or typeof, of the object, just as "double", "integer" or "list" are. Note that mode() itself reports "call" for such an object. See ?mode for more explanation and nice examples. The R Language Definition has a detailed explanation and is a nice read anyway.
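To make this concrete, here is a minimal sketch using the same mtcars fit as above. It inspects the 'variables' attribute and converts it to a character vector. Dropping the first element and deparsing the rest is only one possible way to do the conversion, not the only one.

```r
fit <- lm(mpg ~ hp, mtcars)
vars <- attr(terms(fit), "variables")   # the unevaluated call list(mpg, hp)

typeof(vars)        # "language"
storage.mode(vars)  # "language"
mode(vars)          # "call"

# A call can be treated like a list whose first element is the function
# name (here the symbol `list`), so drop it and deparse the remaining symbols.
var_names <- vapply(as.list(vars)[-1], deparse, character(1))
var_names
# [1] "mpg" "hp"
```

With transformed terms such as log(hp), deparse() would return the whole expression "log(hp)" rather than the bare variable name.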
CodePudding user response:
In your object, lm_obj$terms is a formula, and you can access each part of it using the [[ extractor operator, like
lm_obj$terms[[1]]
#> `~` # formula symbol
If you want to get your input variables, you can use
strsplit(as.character(lm_obj$terms[[3]])[2] , " \\+ ")[[1]]
#> [1] "x" "z"
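As a small sketch of what the [[ indexing is doing, the formula from the question can be taken apart on its own, without fitting a model. The final all.vars() call is a suggested alternative to string splitting, not part of the approach above.

```r
f <- y ~ x + z + z:x   # same formula as in the question

f[[1]]  # `~`          the formula operator
f[[2]]  # y            the left-hand side (response)
f[[3]]  # x + z + z:x  the right-hand side

# all.vars() walks the expression and collects the distinct variable names,
# so no string manipulation is needed.
rhs_vars <- all.vars(f[[3]])
rhs_vars
# [1] "x" "z"
```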
CodePudding user response:
You are on the right track. The "terms" object is where you should look. If you want to omit the response variable, you can use delete.response.
all.vars(delete.response(terms(lm_obj)))
#[1] "x" "z"
I would also like to point you to
labels(terms(lm_obj))
#[1] "x" "z" "x:z"
which is sometimes more useful.
A reproducible example to complement your question
df <- data.frame(y = rnorm(20), x = rnorm(20), z = rnorm(20))
lm_obj <- lm(y ~ x + z + z:x, data = df)
Oh, let me explain why we should look at the "terms" object. Try the different answers here on the following model:
lmfit <- lm(y ~ poly(x) + z + I(z ^ 2) + z:x, data = df, na.action = na.exclude)
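A sketch of that comparison follows. It assumes the formula is meant to be y ~ poly(x) + z + I(z ^ 2) + z:x and adds a set.seed() call (my assumption, only for reproducibility). The comments describe what each approach should pick up, not exact printed output.

```r
set.seed(1)  # assumption: any seed works, the data are random
df <- data.frame(y = rnorm(20), x = rnorm(20), z = rnorm(20))
lmfit <- lm(y ~ poly(x) + z + I(z ^ 2) + z:x, data = df, na.action = na.exclude)

# "terms"-based: only the underlying predictor names
from_terms <- all.vars(delete.response(terms(lmfit)))
from_terms
# [1] "x" "z"

# Call-based: also collects df and na.exclude, so dropping just the
# last element leaves the data frame name in the result
from_call <- head(all.vars(lmfit$call), -1)
from_call
# [1] "y"  "x"  "z"  "df"

# model.frame-based: column names include transformed terms like poly(x)
from_frame <- names(model.frame(lmfit))
from_frame

# Term labels, including transformed and interaction terms
labels(terms(lmfit))
```

The call-based approach depends on which extra arguments were passed to lm(), and the model.frame names reflect transformations, while the "terms" route should stay at x and z regardless.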