How to exclude percentages from Total column and row when using janitor::adorn_percentages()

Time:02-06

Is there any way I can get the output below directly from adorn functions?

library(janitor)
library(stringr)

df <- mtcars %>%
  tabyl(am, cyl) %>%
  adorn_totals(c("row", "col")) %>%
  adorn_percentages("row") %>%
  adorn_pct_formatting(digits = 2) %>%
  adorn_ns(position = "front") 

df
#     am           4          6           8        Total
#      0  3 (15.79%) 4 (21.05%) 12 (63.16%) 19 (100.00%)
#      1  8 (61.54%) 3 (23.08%)  2 (15.38%) 13 (100.00%)
#  Total 11 (34.38%) 7 (21.88%) 14 (43.75%) 32 (100.00%)

df$Total <- str_replace(df$Total, " \\s*\\([^\\)]+\\)", "")
df[df$am == "Total",] <- str_replace(df[df$am == "Total",], " \\s*\\([^\\)]+\\)", "")

df
#     am           4          6           8 Total
#      0  3 (15.79%) 4 (21.05%) 12 (63.16%)    19
#      1  8 (61.54%) 3 (23.08%)  2 (15.38%)    13
#  Total          11          7          14    32
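As a quick sanity check of the clean-up step above, the sketch below applies a pattern of the same shape to a plain character vector with base R's sub(). The helper name strip_pct is made up for illustration, and the sample strings are copied from the Total column printed above.

```r
# Hypothetical helper: drop a trailing " (xx.xx%)" from formatted cells.
# Uses base R only, so it runs without janitor or stringr.
strip_pct <- function(x) {
  # optional whitespace, then "(", one or more non-")" characters, then ")"
  sub("\\s*\\([^\\)]+\\)", "", x)
}

cells <- c("19 (100.00%)", "13 (100.00%)", "32 (100.00%)")
stripped <- strip_pct(cells)
print(stripped)
```

The result is still a character vector; wrap it in as.numeric() if numeric totals are needed.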

CodePudding user response:

Here is a solution that stays in a single pipeline but uses dplyr and readr in addition to janitor:

We add two steps to your code. The first is a mutate(across(...)) with a case_when() condition that applies only to the "Total" row and (the trick) uses parse_number(), which automatically extracts the first number from a string. The second applies parse_number() to the Total column:

library(janitor)
library(readr)
library(dplyr)
mtcars %>%
  tabyl(am, cyl) %>%
  adorn_totals(c("row", "col")) %>%
  adorn_percentages("row") %>% 
  adorn_pct_formatting(digits = 2) %>% 
  adorn_ns(position = "front") %>% 
  mutate(across(-c(am, Total), ~case_when(am == "Total" ~as.character(parse_number(.)),
                                          TRUE ~.))) %>% 
  mutate(Total = parse_number(Total)) 
    am           4          6           8 Total
     0  3 (15.79%) 4 (21.05%) 12 (63.16%)    19
     1  8 (61.54%) 3 (23.08%)  2 (15.38%)    13
 Total          11          7          14    32
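To see why parse_number() does the job in the pipeline above, it can be tried on its own. This is a minimal sketch assuming readr is installed; the sample strings are the "Total" row cells from the question's output, and the variable names are only illustrative.

```r
library(readr)

# Cells from the "Total" row as produced by adorn_ns(position = "front")
cells <- c("11 (34.38%)", "7 (21.88%)", "14 (43.75%)")

# parse_number() keeps the first number it finds and drops the rest
counts <- parse_number(cells)
print(counts)

# Inside case_when() the other branch is character, so the parsed
# number is converted back to character to keep the column type consistent
as_text <- as.character(counts)
print(as_text)
```

The Total column has no mixed branch, which is why the answer can leave it as a plain number there.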

CodePudding user response:

Essentially your problem is that you want to call adorn_totals() after you create the percentages. But you can't do that because then you're working with character columns with values like "3 (15.79%)", and you can't sum them.

I would just create a function to calculate the totals in one data frame and the percentages in the other and join them together:

library(dplyr)
library(janitor)

create_formatted_totals <- function(rows, cols, dat) {
    dat_pct <- dat |>
        tabyl({{ rows }}, {{ cols }}) |>
        adorn_percentages() |>
        adorn_pct_formatting() |>
        adorn_ns(position = "front")


    totals <- dat |>
        tabyl({{ rows }}, {{ cols }}) |>
        adorn_totals(c("row", "col")) |>
        mutate(across(everything(), as.character))

    # Add row totals
    dat_pct$Total <- head(totals$Total, -1)

    # Add col totals
    dat_pct <- rbind(dat_pct, tail(totals, 1))

    return(dat_pct)
}

You can then just do:

create_formatted_totals(am, cyl, mtcars)
#     am         4         6          8 Total
#      0 3 (15.8%) 4 (21.1%) 12 (63.2%)    19
#      1 8 (61.5%) 3 (23.1%)  2 (15.4%)    13
#  Total        11         7         14    32
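The two base R calls that stitch the tables together, head(x, -1) and tail(x, 1), can be looked at in isolation. The sketch below assumes a hand-built stand-in for the totals data frame, with the counts copied from the output above; it is not produced by janitor here.

```r
# Stand-in for `totals` after mutate(across(everything(), as.character))
totals <- data.frame(
  am    = c("0", "1", "Total"),
  `4`   = c("3", "8", "11"),
  `6`   = c("4", "3", "7"),
  `8`   = c("12", "2", "14"),
  Total = c("19", "13", "32"),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Everything except the last element: the row totals for am = 0 and am = 1
row_totals <- head(totals$Total, -1)
print(row_totals)

# Only the last row: the column totals that get appended with rbind()
col_totals <- tail(totals, 1)
print(col_totals)
```

Converting the totals to character first is what lets rbind() combine them with the formatted percentage columns without type conflicts.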

CodePudding user response:

We could use the tidy-select arguments that some of the adorn functions accept to leave the am and Total columns untouched, and then fix the last row with mutate():

library(dplyr)
library(janitor)
mtcars %>%
  tabyl(am, cyl) %>%
  adorn_totals(c("row", "col")) %>%
  adorn_percentages("row", `...` = -c(am, Total)) %>%  
  adorn_pct_formatting(digits = 2, `...` = -c(am, Total)) %>% 
  adorn_ns(position = "front", `...` = -c(am, Total)) %>% 
  mutate(across(-c(am, Total), 
   ~ replace(.x, n(), readr::parse_number(.x[n()]))))

-output

    am           4          6           8 Total
     0  3 (15.79%) 4 (21.05%) 12 (63.16%)    19
     1  8 (61.54%) 3 (23.08%)  2 (15.38%)    13
 Total          11          7          14    32
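The last step of this pipeline overwrites only the final element of each column. Outside of mutate(), where n() is not available, the same idea can be sketched on a single vector with length(); the sample column below is the "4" column from the output, and readr is assumed to be installed.

```r
# One formatted column, with the "Total" row as its last element
col <- c("3 (15.79%)", "8 (61.54%)", "11 (34.38%)")
last <- length(col)  # plays the role of n() inside mutate()

# Replace only the last element with its leading number;
# the numeric 11 is coerced to "11" because `col` is character
fixed <- replace(col, last, readr::parse_number(col[last]))
print(fixed)
```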