Add Percentage to Grouping Variable in gtsummary Package

Time:12-25

I am creating a gtsummary table grouped by mortality status (the variable "fate") from the Bernard data included in the pubh package.

I want to show the percentage of "Dead" and "Alive" next to their counts in the column headers. But since "fate" is the grouping variable, I haven't been able to configure this.

This is my sample code for the table:

library(pubh)
library(dplyr)
library(gtsummary)

data("Bernard")


na.omit(Bernard) %>%
  select(fate, race, apache) %>%
  tbl_summary(
    by = fate,
    type = list(race ~ "categorical", apache ~ "continuous"),
    statistic = list(all_continuous() ~ "{min}, {max}", all_categorical() ~ "{p}%"),
    digits = list(all_continuous() ~ 2, all_categorical() ~ 2),
    missing_text = "(Missing)"
  ) %>%
  add_stat_label() %>%
  modify_header(label ~ "**Variable**") %>%
  modify_caption("**Table 1. Summary statistics by Mortality Status**") %>%
  modify_spanning_header(c("stat_1", "stat_2") ~ "**Fate**") %>%
  bold_labels() %>%
  italicize_labels() %>%
  italicize_levels()

And this is the output

Ideally, I would like the column headers to show: Alive, N = 96 (67..%) and Dead, N = 47 (32..%)

I have tried listing the "fate" variable as categorical and then providing the statistic for percentage:

     type = list(c(race, fate) ~ "categorical", apache ~ "continuous"),

     statistic = list(all_continuous() ~ "{min}, {max}", all_categorical() ~ "{p}%", fate ~ "{p}%"),

This did not work.

I was also thinking that using mutate() to create a new variable before calling tbl_summary() would probably work, but I am curious whether this can be configured explicitly within tbl_summary().

More generally: can summary statistics other than the number of observations be shown for the grouping variable in gtsummary?

CodePudding user response:

You can add the percentage to the header using the modify_header() function. Example below!

library(gtsummary)
packageVersion("gtsummary")
#> [1] '1.6.3'

trial %>%
  tbl_summary(
    by = trt, 
    include = age
  ) %>%
  modify_header(all_stat_cols() ~ "**{level}**, N={n} ({style_percent(p)}%)") %>%
  as_kable() # convert to kable to display on stackoverflow

| Characteristic | Drug A, N=98 (49%) | Drug B, N=102 (51%) |
|:---------------|:-------------------|:--------------------|
| Age            | 46 (37, 59)        | 48 (39, 56)         |
| Unknown        | 7                  | 4                   |

Created on 2022-12-24 with reprex v2.0.2
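
For reference, here is a rough sketch of the same header approach applied to the Bernard table from the question. It assumes the pubh, dplyr and gtsummary packages are installed and that "fate" has the two levels shown in the question. The object name tbl is only for illustration, and the type, digits and styling calls from the original code are left out to keep it short.

```r
# Sketch: header percentages for the grouping variable, using the question's data.
# Assumption: pubh, dplyr and gtsummary are installed.
library(pubh)
library(dplyr)
library(gtsummary)

data("Bernard")

tbl <- na.omit(Bernard) %>%
  select(fate, race, apache) %>%
  tbl_summary(
    by = fate,
    statistic = list(
      all_continuous() ~ "{min}, {max}",
      all_categorical() ~ "{p}%"
    ),
    missing_text = "(Missing)"
  ) %>%
  modify_header(label ~ "**Variable**") %>%
  # {level}, {n} and {p} are evaluated per level of the by variable (fate);
  # p is a proportion, so style_percent() turns it into a percentage
  modify_header(all_stat_cols() ~ "**{level}**, N = {n} ({style_percent(p)}%)") %>%
  modify_spanning_header(all_stat_cols() ~ "**Fate**")

tbl
```

The percentage lives in the header text rather than in a table row, so nothing has to change in the type or statistic arguments, and "fate" does not need to be listed as a summarized variable. If more decimals are wanted in the header, style_percent() should accept a digits argument in recent gtsummary versions, but check the documentation for the version you have installed.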
