Format numbers in tbl_summary gtsummary r

Time:06-16

Problem:

I want to format large counts in a categorical tbl_summary so that they are shown in thousands, without the trailing zeros.

Reprex:

library(dplyr)
library(gtsummary)

data <-
  data.frame(variable1 = rep(1:4, each = 10000)) %>%
  mutate(
    variable1 =
      case_when(
        variable1 %in% 1 ~ "Dog",
        variable1 %in% 2 ~ "Cat",
        variable1 %in% 3 ~ "Lion",
        variable1 %in% 4 ~ "Tiger"
      )
  )

data %>% 
  tbl_summary()

Output:

Characteristic N = 40,000
variable1
Cat 10,000 (25%)
Dog 10,000 (25%)
Lion 10,000 (25%)
Tiger 10,000 (25%)

Desired output:

Characteristic N = 40
variable1 Results in thousands
Cat 10 (25%)
Dog 10 (25%)
Lion 10 (25%)
Tiger 10 (25%)

Attempts:

As you can see, it is a frequency table, but with the large counts simplified to thousands. I tried to get this output with gtsummary functions such as style_number, but I keep receiving the error: Error in .x * scale : non-numeric argument to binary operator. The answer could also lie in the R package scales (function: label_number).

Let me know if you have some insights. Thanks.

CodePudding user response:

In the digits= argument you can pass both integers AND styling functions. In the example below, I wrote a new function that scales large numbers to the nearest thousand.

Example!

library(gtsummary)
packageVersion("gtsummary")
#> [1] '1.6.0'

style_number_10K <- function(x) paste0(style_number(x, scale = 0.001), "K")
style_number_10K(10002)
#> [1] "10K"

data <-
  data.frame(variable1 = rep(1:4, each = 10000)) %>%
  dplyr::mutate(
    variable1 =
      dplyr::case_when(
        variable1 %in% 1 ~ "Dog",
        variable1 %in% 2 ~ "Cat",
        variable1 %in% 3 ~ "Lion",
        variable1 %in% 4 ~ "Tiger"
      )
  )

tbl <-
  data %>% 
  tbl_summary(
    digits = all_categorical() ~ list(style_number_10K, 0)
  )

Created on 2022-06-15 by the reprex package (v2.0.1)
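The question also mentions scales::label_number and a header reading N = 40, which the example above does not cover. A minimal sketch of that variant follows; it is not part of the original answer. It assumes the scales package is installed, that N is available as a number inside the modify_header() glue string (this may depend on the gtsummary version), and the name style_thousands is made up for illustration.

```r
library(gtsummary)

# Hypothetical helper: label_number() returns a function that divides by
# 1,000 and rounds to whole units
style_thousands <- scales::label_number(scale = 0.001, accuracy = 1, big.mark = ",")
style_thousands(c(10002, 40000))
#> [1] "10" "40"

# Same data as in the question, built without case_when()
data <- data.frame(
  variable1 = rep(c("Dog", "Cat", "Lion", "Tiger"), each = 10000)
)

tbl <- tbl_summary(
  data,
  # counts use the custom formatter, percentages keep 0 decimals
  digits = all_categorical() ~ list(style_thousands, 0),
  # state the unit in the row label
  label = variable1 ~ "variable1 (results in thousands)"
)

# Assumption: N is numeric here, so it can be rescaled with style_number()
tbl <- modify_header(tbl, stat_0 ~ "**N = {style_number(N, scale = 0.001)}**")
```

Any function that takes a numeric vector and returns a character vector can be used in digits= this way, so style_thousands and style_number_10K are interchangeable.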

CodePudding user response:

Try this

#all your code here

data %>% 
  tbl_summary() -> df

# The formatted cells are stored in df$table_body$stat_0 (df$N is only the total count)
df$table_body$stat_0 <- gsub(",000 ", " ", df$table_body$stat_0, fixed = TRUE)
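
Note that editing the formatted text only works when every count is an exact multiple of 1,000. A small base-R sketch with made-up cell values (the vectors cells and counts are hypothetical, not taken from the table above) shows the difference between replacing text and scaling the numbers:

```r
# Hypothetical cell values, formatted the way tbl_summary prints counts
cells <- c("10,000 (25%)", "10,600 (27%)")

# Text replacement only catches exact multiples of 1,000
replaced <- gsub(",000 ", " ", cells, fixed = TRUE)
replaced
#> [1] "10 (25%)"     "10,600 (27%)"

# Scaling the underlying counts handles both cases
counts <- c(10000, 10600)
scaled <- format(round(counts / 1000), big.mark = ",")
scaled
#> [1] "10" "11"
```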