I am trying to find a way to automatically adjust the model formula that R uses to fit a model. Here is a simple example. In the code below I want to choose whether "a" and "b" are included in the model by setting "include.a" and "include.b". If a flag is TRUE, the corresponding variable should be added to the model formula; if it is FALSE, the variable should be left out.
x=1:10
y=2:11
y[9] = y[9] + 1
a = rep(3, times = 10)
a[7] = 7
b = c(3:10, 10, 10)
include.a = FALSE
include.b = TRUE
# to get the model y ~ x + b
# what I would like to write (pseudocode, not valid R):
# model = lm(y ~ x
#            if(include.b == TRUE){ + b }
# )
I've been searching this website everywhere but cannot find any hints.
CodePudding user response:
One option would be to define a character vector with the desired covariate names, create a formula from it using as.formula(), and then plug that formula into lm():
# specify what you want to include
# both a and b
includes <- c("a","b")
# define formula
frmla <- as.formula(paste0("x ~ y",
                           ifelse(!is.null(includes),
                                  paste0(" + ", paste(includes, collapse = " + ")), "")))
# > frmla
# x ~ y + a + b
# Run model
lm(frmla)
#Call:
#lm(formula = frmla)
#Coefficients:
#(Intercept)            y            a            b
#  -1.250e+00    7.500e-01    8.885e-17    2.500e-01
Add as many as you like:
includes <- c("a", "b", "c", "d", "f")
frmla <- as.formula(paste0("x ~ y", ifelse(!is.null(includes), paste0(" + ", paste(includes, collapse = " + ")), "")))
# > frmla
# x ~ y + a + b + c + d + f
Or none at all:
includes <- c()
frmla <- as.formula(paste0("x ~ y", ifelse(!is.null(includes), paste0(" + ", paste(includes, collapse = " + ")), "")))
# > frmla
# x ~ y
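The same idea can be wrapped in a small helper so the paste logic is written only once. This is a minimal sketch, not part of the answer above: build_formula() is a hypothetical name, and it relies on the fact that joining the base string and the extra names with " + " already covers the empty case, so no ifelse() is needed.

```r
# Example data from the question
x <- 1:10
y <- 2:11
y[9] <- y[9] + 1
a <- rep(3, times = 10)
a[7] <- 7
b <- c(3:10, 10, 10)

# build_formula() is a hypothetical helper, not a base R function.
# c(base, NULL) is just `base`, so an empty `includes` adds nothing.
build_formula <- function(base = "x ~ y", includes = NULL) {
  as.formula(paste(c(base, includes), collapse = " + "),
             env = parent.frame())  # look variables up where the helper was called
}

f0 <- build_formula()                        # no extra covariates
f2 <- build_formula(includes = c("a", "b"))  # adds a and b
fit <- lm(f2)
```

Passing env = parent.frame() is an assumption about where the variables live: it makes lm() search the caller's environment rather than the helper's own frame.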
CodePudding user response:
1) Use reformulate as shown:
fo <- reformulate(c("x", if (include.a) "a", if (include.b) "b"), "y")
lm(fo)
giving:
Call:
lm(formula = fo)
Coefficients:
(Intercept) x b
1.06154 1.10769 -0.07692
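The reformulate() call in 1) works because an if without an else returns NULL when its condition is FALSE, and c() silently drops NULL elements. A short sketch of that step in isolation (rhs_terms is just an illustrative name):

```r
include.a <- FALSE
include.b <- TRUE

# `if (FALSE) "a"` evaluates to NULL, which c() discards
rhs_terms <- c("x", if (include.a) "a", if (include.b) "b")

# reformulate() joins the term labels with `+` and puts the response on the left
fo <- reformulate(rhs_terms, response = "y")
```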
2) Alternately call lm like this:
do.call("lm", list(fo))
giving a nicer Call: line:
Call:
lm(formula = y ~ x + b)
Coefficients:
(Intercept) x b
1.06154 1.10769 -0.07692
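The difference between 1) and 2) lies only in what gets stored in the fitted object's call, not in the fit itself. The following sketch, using the question's data, inspects the stored call to make that visible; m1 and m2 are illustrative names.

```r
# Example data from the question
x <- 1:10
y <- 2:11
y[9] <- y[9] + 1
b <- c(3:10, 10, 10)

fo <- reformulate(c("x", "b"), "y")

m1 <- lm(fo)                   # the call records the symbol `fo`
m2 <- do.call("lm", list(fo))  # the call records the formula itself

is.name(m1$call$formula)              # the stored argument is a name
inherits(m2$call$formula, "formula")  # the stored argument is a formula
all.equal(coef(m1), coef(m2))         # the fitted coefficients agree
```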
3) Also consider a design where a single character vector v of variable names is provided.
v <- "b"
fo <- reformulate(c("x", v), "y")
lm(fo)
v <- c("a", "b")
fo <- reformulate(c("x", v), "y")
lm(fo)
v <- c()
fo <- reformulate(c("x", v), "y")
lm(fo)
In a function it would be written like this:
my_lm <- function(v = c(), resp = "y", indep = "x", env = parent.frame()) {
fo <- reformulate(c(indep, v), resp, env = env)
do.call("lm", list(fo))
}
my_lm("b")
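As a quick check of the function above, the sketch below repeats its definition together with the question's data and confirms which terms end up in each fit. It assumes the variables are plain vectors in the calling environment, as in the question.

```r
# Example data from the question
x <- 1:10
y <- 2:11
y[9] <- y[9] + 1
a <- rep(3, times = 10)
a[7] <- 7
b <- c(3:10, 10, 10)

# Same definition as in the answer above
my_lm <- function(v = c(), resp = "y", indep = "x", env = parent.frame()) {
  fo <- reformulate(c(indep, v), resp, env = env)
  do.call("lm", list(fo))
}

fit_none <- my_lm()             # y ~ x
fit_b    <- my_lm("b")          # y ~ x + b
fit_ab   <- my_lm(c("a", "b"))  # y ~ x + a + b

names(coef(fit_ab))
```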