Automate univariate regressions for many outcomes in tbl_uvregression (gtsummary), return formatted

Time:04-13

I would like to use the tbl_uvregression function (gtsummary package, R) because it can build univariate regression models while holding either a covariate or the outcome constant.

In my case, for each outcome I need one nicely formatted table of univariate regression results containing every variable in the dataframe except the outcome variable. This works fine if I subset my dataframe to contain only one outcome and the covariates of interest before passing it to tbl_uvregression.

However, I have many outcome variables and need help automating this process. For each outcome variable, I want to produce one table of univariate regressions using the same set of covariates, without including the other outcome variables. I also want to label the tables so I can keep track of which table belongs to which outcome variable.

How do I do this?

# Libraries
library(gtsummary)
library(tidyverse)

# Data as well as a few artificial variables
data("iris")
my_iris <- as.data.frame(iris)

my_iris$out1 <- sample(c(0,1), 150, replace = TRUE)
my_iris$out2 <- sample(c(0,1), 150, replace = TRUE)
my_iris$out3 <- sample(c(0,1), 150, replace = TRUE)

# Extra variables below to simulate that the dataframe has extra covariates, 
# hence need to select those of interest.
my_iris$x1 <- sample(c(1:12), 150, replace = TRUE)
my_iris$x2 <- sample(c(50:100), 150, replace = TRUE)
my_iris$x3 <- sample(c(18:100), 150, replace = TRUE)


# List of outcome(*outcome*) and predictor(*preds*) variables I need to run univariate logistic regressions for.
outcome <- c("out1", "out2", "out3") # have a long list, but this is sufficient for demo
preds <- c("Species", "Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width") # same here


# To produce a nicely formatted table for a single outcome I can do:
my_iris %>% 
    dplyr::select(outcome[1], all_of(preds)) %>% 
    tbl_uvregression(method = glm,
                     y = outcome[1],
                     method.args = list(family = binomial),
                     exponentiate = TRUE) %>%
    bold_labels() %>%
    modify_caption(paste("Univariate Regression Model with", outcome[1], "as Outcome", sep = " "))

# How to automate production of above table for multiple outcomes?


CodePudding user response:

I would use lapply to loop through the outcomes like this:

library(gtsummary)
library(tidyverse)

# Data as well as a few artificial variables
data("iris")
my_iris <- as.data.frame(iris)

my_iris$out1 <- sample(c(0,1), 150, replace = TRUE)
my_iris$out2 <- sample(c(0,1), 150, replace = TRUE)
my_iris$out3 <- sample(c(0,1), 150, replace = TRUE)

# Extra variables below to simulate that the dataframe has extra covariates, 
# hence need to select those of interest.
my_iris$x1 <- sample(c(1:12), 150, replace = TRUE)
my_iris$x2 <- sample(c(50:100), 150, replace = TRUE)
my_iris$x3 <- sample(c(18:100), 150, replace = TRUE)


# List of outcome(*outcome*) and predictor(*preds*) variables I need to run univariate logistic regressions for.
outcome <- c("out1", "out2", "out3") # have a long list, but this is sufficient for demo
preds <- c("Species", "Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width") # same here


# Loop over the outcomes; each iteration returns one formatted table,
# so the result is a list with one table per outcome (in the order of `outcome`).
lapply(outcome, function(x) {
  my_iris %>%
    dplyr::select(!!x, all_of(preds)) %>%
    tbl_uvregression(method = glm,
                     y = !!x,
                     method.args = list(family = binomial),
                     exponentiate = TRUE) %>%
    bold_labels() %>%
    modify_caption(paste("Univariate Regression Model with", x, "as Outcome", sep = " "))
})
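
To keep track of which table belongs to which outcome, the list returned by lapply can be named after the outcomes. The following is a minimal sketch of that idea, not a definitive implementation: uv_table is a hypothetical helper name, set.seed(123) is an assumption added only so the simulated outcomes are reproducible, and the final tbl_merge step is an optional extra that places the outcomes side by side in one table.

```r
library(gtsummary)
library(dplyr)

set.seed(123)  # assumption: fixed seed so the simulated outcomes are reproducible

my_iris <- as.data.frame(iris)
my_iris$out1 <- sample(c(0, 1), 150, replace = TRUE)
my_iris$out2 <- sample(c(0, 1), 150, replace = TRUE)
my_iris$out3 <- sample(c(0, 1), 150, replace = TRUE)

outcome <- c("out1", "out2", "out3")
preds <- c("Species", "Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width")

# Hypothetical helper: builds the univariate table for one outcome,
# using the same steps as the lapply answer above.
uv_table <- function(y_var) {
  my_iris %>%
    dplyr::select(all_of(c(y_var, preds))) %>%
    tbl_uvregression(method = glm,
                     y = !!y_var,
                     method.args = list(family = binomial),
                     exponentiate = TRUE) %>%
    bold_labels() %>%
    modify_caption(paste("Univariate Regression Model with", y_var, "as Outcome"))
}

# Name each list element after its outcome so tables can be looked up by name.
tbls <- setNames(lapply(outcome, uv_table), outcome)

tbls$out2  # the table for out2

# Optional: one combined table with a column group (spanner) per outcome.
# How the individual captions carry over into the merged table is not checked here.
merged <- tbl_merge(tbls = unname(tbls),
                    tab_spanner = paste0("**", outcome, "**"))
```

With the named list, tbls[["out1"]] retrieves a single table, while merged is convenient when the same covariates should be compared across outcomes in one view.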