loop through gtsummary table to pick out only significant variables

Time:11-09

I have a question. I am relatively new to R and am transitioning some code from another app to R. In that code, I was able to loop through a table and pick out only the significant variables based on the p-value and the size of the odds ratio for logistic regression. I could then say something like "x had a significant link with y" when the p-value was less than or equal to 0.05 and the odds ratio was above 1.00, and the converse, "x had a significant negative link with y", when the p-value was less than 0.05 and the odds ratio was below 1.00. I could then insert these statements into the text, which, as I understand from the gtsummary documentation, is what inline_text does. As I am trying to get my bearings with R, I was wondering how I would accomplish this with gtsummary tables. My reproducible code does not work, but it is below:

# install.packages("gtsummary")
library(gtsummary)
library(tidyverse)

#simulated data
gender <- sample(c(0,1), size = 1000, replace = TRUE)
age <- round(runif(1000, 18, 80))
xb <- -9 + 3.5*gender + 0.2*age
p <- 1/(1 + exp(-xb))
y <- rbinom(n = 1000, size = 1, prob = p)
mod <- glm(y ~ gender + age, family = "binomial")
summary(mod)

#create the gtsummary table
tab1 = mod %>%
  tbl_regression(exponentiate = TRUE) %>%
  as_gt() %>%
  gt::tab_source_note(gt::md("*This data is simulated*")) 

#attempt of going through the gtsummary table
for (i in 1:nrow(tab1[1:3,])) {  # does one row at a time
  pv = tab1[["_data"]]$p.value
  num = tab1[i, "pv"]
  name = tab1[i, "variable"]
  if(pv <=0.05 ){
    cat("The link between", name, "and is significant. ")
  }
}

I ask about the gtsummary regression table because I will have to do the same thing with tbl_summary as well, and I thought I would begin with the regression version. The idea is to produce the gorgeous inline_text output via an if/else. All of this is triggered by going down the p-value column and then pulling the variable name and the inline_text information into the sentence. I have looked through the questions others have asked, but I haven't found anything that gets to the heart of this. If I have missed it, please point me in the right direction.

Answer:

Every gtsummary table contains a data frame, x$table_body (where x is the table object). I think it's easier to extract the information you need from there. Example below! (You could also wrap the last line in inline_text() if that works better for you.)

# install.packages("gtsummary")
library(gtsummary)
#> #BlackLivesMatter
library(tidyverse)

#simulated data
gender <- sample(c(0,1), size = 1000, replace = TRUE)
age <- round(runif(1000, 18, 80))
xb <- -9 + 3.5*gender + 0.2*age
p <- 1/(1 + exp(-xb))
y <- rbinom(n = 1000, size = 1, prob = p)
mod <- glm(y ~ gender + age, family = "binomial")

#create the gtsummary table
tab1 = mod %>% tbl_regression(exponentiate = TRUE) 

# extract the variable names and the pvalues
tab1$table_body %>%
  select(variable, p.value) %>%
  filter(p.value <= 0.05) %>% # only keep the sig pvalues
  deframe() %>%
  imap(~str_glue("The link between 'y' and {.y} is significant ({style_pvalue(.x, prepend_p = TRUE)})."))
#> $gender
#> The link between 'y' and gender is significant (p<0.001).
#> 
#> $age
#> The link between 'y' and age is significant (p<0.001).

Created on 2022-11-07 with reprex v2.0.2
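
The question also asks for the direction of the link (odds ratio above or below 1.00), which the example above does not cover. Here is a minimal sketch of one way to add it, in base R only. The helper name describe_links and its arguments are hypothetical, and it assumes the data frame has the columns variable, estimate (the odds ratio, as stored when exponentiate = TRUE) and p.value, which is how tab1$table_body appears to be laid out. The toy data frame uses made-up numbers purely for illustration.

```r
# Hypothetical helper: build one sentence per significant predictor.
# `tbl` is assumed to be a data frame with columns `variable`, `estimate`
# (the odds ratio) and `p.value`, e.g. tab1$table_body from the example above.
describe_links <- function(tbl, outcome = "y", alpha = 0.05) {
  # keep only rows that have a p-value and an estimate, and that are significant
  keep <- !is.na(tbl$p.value) & !is.na(tbl$estimate) & tbl$p.value <= alpha
  sig <- tbl[keep, , drop = FALSE]

  # odds ratio above 1 -> positive link, otherwise -> negative link
  direction <- ifelse(sig$estimate > 1, "positive", "negative")

  sentences <- sprintf(
    "%s had a significant %s link with %s (OR = %.2f).",
    as.character(sig$variable), direction, outcome, sig$estimate
  )
  names(sentences) <- as.character(sig$variable)
  sentences
}

# Toy stand-in for tab1$table_body (made-up numbers, illustration only)
toy <- data.frame(
  variable = c("gender", "age", "income"),
  estimate = c(30.1, 0.80, 1.02),
  p.value  = c(0.0004, 0.01, 0.40),
  stringsAsFactors = FALSE
)

result <- describe_links(toy)
print(unname(result))
```

With the real table, the call would be describe_links(tab1$table_body). The sentences come back as a named character vector, so result[["age"]] can be dropped into inline R Markdown text. For the formatted estimate, confidence interval and p-value, something like inline_text(tab1, variable = all_of("age")) may be an alternative, assuming your gtsummary version accepts tidyselect helpers in that argument.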
