I've run into a limitation with glm and lm in R: they don't seem to accept a range of variables.
For example, my code is:
model1 <- glm(RESPONSE ~ ., data = train.df, family = "binomial")
summary(model1)
RESPONSE is a variable name in the dataset. I am aware that the dot (".") means all other variables.
Is there a way to tell R to use a range of variables, for example columns 1 to N, where N is an integer?
I'm asking because I need to include many variables, and typing them all out without mistakes takes a long time.
Your help is greatly appreciated.
CodePudding user response:
Using the built-in anscombe data frame, this regresses y1 on the range of columns from x3 to x4. The variable names can be replaced with column numbers, or even a mix of variable names and column numbers.
lm(y1 ~ ., subset(anscombe, select = c(y1, x3:x4)))
lm(y1 ~ ., subset(anscombe, select = c(5, 3:4)))
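The same idea should carry over to the glm call in the question. Here is a minimal sketch, assuming the built-in mtcars data frame stands in for train.df, with am as the binary response and the first n columns as the predictor range (n, resp, and cols are illustrative names, not from the question):

```r
# mtcars stands in for train.df here (an assumption for illustration)
n <- 3
resp <- "am"                          # binary response column
cols <- c(resp, names(mtcars)[1:n])   # response plus columns 1..n
model <- glm(am ~ ., data = mtcars[cols], family = "binomial")
names(coef(model))
```

If the response column itself falls inside 1:n, drop it from the range first (for example with setdiff) so it is not selected twice.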
We can alternatively express this using the native pipe (the _ placeholder needs R 4.2 or later):
anscombe |>
  subset(select = c(y1, x3:x4)) |>
  lm(y1 ~ ., data = _)
Using dplyr, it is nearly the same:
library(dplyr)
anscombe %>%
  select(y1, x3:x4) %>%
  lm(y1 ~ ., data = .)
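Another option, not part of the answer above, is to build the formula itself from a range of column names and leave the data frame untouched. A sketch using base R's reformulate, again on anscombe; the range 3:4 and the names predictors, f, and fit are chosen only for illustration:

```r
# Build y1 ~ x3 + x4 from column positions 3 to 4 of anscombe
predictors <- names(anscombe)[3:4]
f <- reformulate(predictors, response = "y1")
fit <- lm(f, data = anscombe)
names(coef(fit))
```

This keeps the full data frame available to the model, which can help if later steps need the columns that were left out of the formula.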