How can I use the names stored in a character vector as variables in a model? For example, why can't I do this:
loop_variables = c("Age", "BMI", "Height")
for (i in 1:length(loop_variables)) {
basic_logistic_model = glm(outcome~loop_variables[i], data=DB, family="binomial")
summary(basic_logistic_model)
}
I see a lot of R users building vectors with the names of study variables and then looping over them. What am I doing wrong?
CodePudding user response:
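It is the formula that needs to be updated: loop_variables[i] is a character string, and glm() does not substitute it into the formula as a variable name. We can build the formula with paste() or reformulate(). In addition, it is better to store the output of summary() in an object; a list suits this well.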
summary_lst <- vector('list', length(loop_variables))
names(summary_lst) <- loop_variables
for (i in seq_along(loop_variables)) {
# convert the column to factor column
DB[[loop_variables[i]]] <- factor(DB[[loop_variables[i]]])
# create the formula
fmla <- reformulate(loop_variables[i], response = 'outcome')
basic_logistic_model = glm(fmla, data=DB, family="binomial")
# assign the summary output to the list element
summary_lst[[i]] <- summary(basic_logistic_model)
}
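As a minimal sketch of the paste() alternative mentioned above, the same loop can be written with as.formula(paste(...)). Since DB and outcome are not shown in the question, this assumes the built-in mtcars data with am as a stand-in binary outcome; the predictor names are likewise only illustrative.

```r
# Stand-in data: mtcars ships with R; `am` (0/1) plays the role of `outcome`
predictors <- c("mpg", "wt", "hp")

# Both helpers build the same formula from a character string
f_paste <- as.formula(paste("am ~", predictors[1]))
f_reform <- reformulate(predictors[1], response = "am")

# Fit one logistic model per predictor and keep the summaries in a named list
summary_lst <- lapply(predictors, function(v) {
  fit <- glm(as.formula(paste("am ~", v)), data = mtcars, family = binomial)
  summary(fit)
})
names(summary_lst) <- predictors

# Coefficient table for one of the models
summary_lst$wt$coefficients
```

reformulate() is usually the safer choice when variable names may contain spaces or special characters, whereas paste() is convenient for quick, well-behaved names.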
CodePudding user response:
In my view, the data transformation step (e.g. converting variables into factors) and the modelling step (e.g. calling glm()) should be kept separate rather than mixed inside the loop, for the sake of readability and efficiency.
Here, I will show how to run the iterations with purrr::map(), after the data to be analysed has been transformed with dplyr::mutate() beforehand.
Package loading
library(purrr) # for `map`, `set_names`
library(dplyr) # for `mutate`
Data transformation
Add new variables converted into factors using dummy (treatment) coding
fct_ToothGrowth <- ToothGrowth |>
mutate(
fct_dose = dose |>
as.factor(),
fct_len = len |>
## The numeric variable `len` is converted
## into a three-level factor
cut(3) |>
as.factor()
)
contrasts(fct_ToothGrowth$fct_dose)
contrasts(fct_ToothGrowth$fct_len)
Add new variables converted into factors using non-dummy coding
Sum contrast and forward difference coding are used here as examples.
fct_ToothGrowth <- ToothGrowth |>
mutate(
fct_dose = `contrasts<-`(
factor(
dose,
levels = c("0.5", "1", "2")
), ,
## sum contrast coding (also known as deviation coding)
contr.sum(3)
),
fct_len = `contrasts<-`(
factor(
cut(len, 3)
), ,
## Forward difference coding
MASS::contr.sdif(3)
)
)
contrasts(fct_ToothGrowth$fct_dose)
contrasts(fct_ToothGrowth$fct_len)
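To make the difference between the two codings concrete, here is a small base-R sketch comparing the contrast matrices directly. The factor f and its level names are hypothetical and only serve as an illustration.

```r
# Dummy (treatment) coding: level 1 is the reference, so its row is all zeros
contr.treatment(3)

# Sum (deviation) coding: each column sums to zero across the levels
contr.sum(3)

# Attaching a coding to a factor changes how glm()/lm() build the design matrix
f <- factor(c("low", "mid", "high"), levels = c("low", "mid", "high"))
contrasts(f) <- contr.sum(3)
contrasts(f)
```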
Looping glm()
explanatory_variables <- c("fct_len", "fct_dose", "len", "dose")
summaries <- map(
.x = explanatory_variables,
## `.x` in the lambda below is replaced in turn by each element
## of the vector: "fct_len", "fct_dose", "len", and "dose".
~ paste0("supp ~ ", .x) |>
## `supp ~ fct_len`, ..., `supp ~ dose` are inputted
## into the first argument of `glm()`, namely `formula` argument
glm(family = binomial, data = fct_ToothGrowth)
) |>
## name each returned model after its explanatory variable
set_names(nm = explanatory_variables)
summaries$fct_len
summaries$fct_dose
summaries$len
summaries$dose
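Note that the elements returned by map() above are fitted glm objects, so printing them shows the model call and coefficients rather than a full summary. As a hedged sketch of one way to get the coefficient tables, summary() can be applied in a second map() call. The names fits and coef_tables are introduced here for illustration, and the example uses only the original ToothGrowth columns.

```r
library(purrr)

explanatory_variables <- c("len", "dose")

# Fit one logistic model per explanatory variable (ToothGrowth ships with R)
fits <- map(
  explanatory_variables,
  ~ glm(reformulate(.x, response = "supp"), family = binomial, data = ToothGrowth)
) |>
  set_names(explanatory_variables)

# Pull the coefficient table (estimate, std. error, z value, p value) from each fit
coef_tables <- map(fits, ~ coef(summary(.x)))
coef_tables$len
```

Keeping the fitted models and the extracted tables in separate lists means the models remain available for later calls such as predict() or anova().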