How to loop over string variables in R

Time:08-04

How can I use the names stored in a string vector as variables in models? For example, why can't I do this:

loop_variables = c("Age", "BMI", "Height")

for (i in 1:length(loop_variables)) {
    basic_logistic_model = glm(outcome~loop_variables[i], data=DB, family="binomial")
    summary(basic_logistic_model)
}

I see a lot of R users building a vector with the names of their study variables and then looping over it. What am I doing wrong?

CodePudding user response:

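It is the formula that needs to be updated. Inside a formula, loop_variables[i] evaluates to a character string such as "Age", not to the column of that name in DB. We may build the formula with paste or reformulate instead. In addition, it is better to store the output of summary in an object; a list suits this well.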

summary_lst <- vector('list', length(loop_variables))
names(summary_lst) <- loop_variables
for (i in seq_along(loop_variables)) {
    # convert the column to factor column
    DB[[loop_variables[i]]] <- factor(DB[[loop_variables[i]]])
    # create the formula
    fmla <- reformulate(loop_variables[i], response = 'outcome')
    basic_logistic_model = glm(fmla, data=DB, family="binomial")
    # assign the summary output to the list element
    summary_lst[[i]] <- summary(basic_logistic_model)
}
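The paste route mentioned above can be sketched as follows. This is only an illustration: DB here is a simulated stand-in (the asker's real data is not shown), the predictors are left numeric rather than converted to factors, and lapply is used in place of the for loop.

```r
# Simulated stand-in for the asker's `DB` (hypothetical data, for illustration only)
set.seed(1)
DB <- data.frame(
  outcome = rbinom(100, 1, 0.5),
  Age     = rnorm(100, 50, 10),
  BMI     = rnorm(100, 25, 4),
  Height  = rnorm(100, 170, 10)
)
loop_variables <- c("Age", "BMI", "Height")

# Build each formula from a string, then fit one logistic model per variable
summary_lst <- lapply(loop_variables, function(v) {
  fmla <- as.formula(paste("outcome ~", v))
  summary(glm(fmla, data = DB, family = "binomial"))
})
names(summary_lst) <- loop_variables

# Coefficient table for a single variable
coef(summary_lst[["BMI"]])
```

as.formula(paste("outcome ~", v)) and reformulate(v, response = "outcome") should produce the same formula here; reformulate is simply more convenient when there are several terms on the right-hand side.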

CodePudding user response:

In my view, the data transformation part (e.g. converting variables into factors) and the modelling part (e.g. calling glm()) ought to be kept separate rather than mixed inside the loop, for the sake of code readability and efficiency.

Here, the data to be analysed is first transformed with dplyr::mutate(), and the models are then fitted in a loop with purrr::map(). The code uses the native pipe |>, which requires R 4.1 or later.

Package loading

library(purrr) # for `map`, `set_names`
library(dplyr) # for `mutate`

Data transformation

Add new variables converted into factors using dummy (treatment) coding

fct_ToothGrowth <- ToothGrowth |>
  mutate(
    fct_dose = dose |>
      as.factor(),
    fct_len = len |>
      ## The numeric variable `len` is converted
      ## into a three-level factor 
      cut(3) |>
      as.factor()
  )

contrasts(fct_ToothGrowth$fct_dose)
contrasts(fct_ToothGrowth$fct_len)
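As a quick check of what the two contrasts() calls report, here is a minimal base R sketch. It assumes R's default contrast options are in effect, under which unordered factors get dummy coding via contr.treatment; the names dose_f and len_f are introduced only for this illustration.

```r
# Same transformation as above, written with base R only
dose_f <- as.factor(ToothGrowth$dose)
len_f  <- cut(ToothGrowth$len, 3)   # three equal-width bins; the result is already a factor

levels(dose_f)      # the first level is the reference category
contrasts(dose_f)   # one indicator column for every level except the first
table(len_f)        # number of observations in each bin
```

Since cut() already returns a factor, the extra as.factor() call on its result leaves the variable unchanged.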

Add new variables converted into factors using non-dummy coding

Sum contrast and forward difference coding are used here as examples.

fct_ToothGrowth <- ToothGrowth |>
  mutate(
    fct_dose = `contrasts<-`(
      factor(
        dose,
        levels = c("0.5", "1", "2")
      ), ,
      ## sum contrast coding (also known as deviation coding)
      contr.sum(3)
    ),
    fct_len = `contrasts<-`(
      factor(
        cut(len, 3)
      ), ,
      ## Forward difference coding
      MASS::contr.sdif(3)
    )
  )

contrasts(fct_ToothGrowth$fct_dose)
contrasts(fct_ToothGrowth$fct_len)
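To see what sum coding changes in practice, here is a small sketch. It uses lm() with len as the response purely for illustration (this is not one of the models in this answer), and tg is a hypothetical name. With sum contrasts on a single factor, the intercept is the unweighted mean of the group means rather than the mean of the reference level.

```r
# Illustration only: a linear model with a sum-coded factor
tg <- transform(ToothGrowth, dose_f = factor(dose))
contrasts(tg$dose_f) <- contr.sum(3)

fit <- lm(len ~ dose_f, data = tg)
group_means <- tapply(tg$len, tg$dose_f, mean)

coef(fit)[["(Intercept)"]]   # intercept under sum coding
mean(group_means)            # unweighted mean of the three group means
```

The two printed values should agree, whereas under the default dummy coding the intercept would be the mean of the first level.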

Looping glm()

explanatory_variables <- c("fct_len", "fct_dose", "len", "dose")

summaries <- map(
  .x = explanatory_variables,
  ## `.x` takes each of "fct_len", "fct_dose", "len", and "dose" in turn.
  ~ paste0("supp ~ ", .x) |>
  ## The strings "supp ~ fct_len", ..., "supp ~ dose" are passed
  ## as the first argument of `glm()`, namely `formula`
    glm(family = binomial, data = fct_ToothGrowth)
) |>
  ## name each fitted model after its explanatory variable
  set_names(nm = explanatory_variables)

summaries$fct_len
summaries$fct_dose
summaries$len
summaries$dose
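If the packages are not available, or only the coefficient tables are needed, the same loop can be sketched in base R. This is a minimal version under some assumptions: it uses the untransformed ToothGrowth data with only the numeric predictors, and models and slopes are names chosen for this illustration.

```r
explanatory_variables <- c("len", "dose")

# Fit one logistic model per explanatory variable
models <- lapply(explanatory_variables, function(v) {
  glm(reformulate(v, response = "supp"), family = binomial, data = ToothGrowth)
})
names(models) <- explanatory_variables

# One row per model: estimate, standard error, z value and p-value of the slope
slopes <- t(sapply(models, function(m) coef(summary(m))[2, ]))
slopes
```

Collecting the slope rows in one matrix makes it easier to compare the variables side by side than printing each model in turn.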