Home > Blockchain >  Merge multiple frequency tables together in R
Merge multiple frequency tables together in R

Time:12-09

I am trying to create a table with the frequency of multiple categorical variables. I have data like this:

id    animal    color     shape
1      bear     orange    circle
2.     dog      blue      triangle
3.     dog      yellow    square
4.     bear     yellow    square
5.     cat      yellow    rectangle

I want to get a table like this:

Variable     Level       freq(n=5)   percent

animal       bear         2           40.0
             dog          2           40.0
             cat          1           20.0


color       orange        1           20.0
            blue          1           20.0
            yellow        3           60.0

shape       circle        1           20.0
            triangle      1           20.0
            square        2           40.0
            rectangle     1           20.0

I can make frequency tables for them individually but was wondering if anyone knew a package or anything that can easily stack them like this. Thanks!

CodePudding user response:

Update: Better:

df1 <- df %>% 
  pivot_longer(
    -id,
    names_to = "Variable",
    values_to = "Level"
  ) %>% 
  group_by(Variable, Level) %>% 
  summarise(freq = n()) %>% 
  mutate(percent = freq/sum(freq)*100) %>% 
  mutate(Variable = ifelse(duplicated(Variable), NA, Variable)) %>% 
  ungroup()

library(flextable)
flextable(df1)

enter image description here

library(dplyr)
library(tidyr)

df %>% 
  pivot_longer(
    -id,
    names_to = "Variable",
    values_to = "Level"
  ) %>% 
  group_by(Variable, Level) %>% 
  summarise(freq = n()) %>% 
  mutate(percent = freq/sum(freq)*100) 
   Variable Level      freq percent
   <chr>    <chr>     <int>   <dbl>
 1 animal   bear          2      40
 2 animal   cat           1      20
 3 animal   dog           2      40
 4 color    blue          1      20
 5 color    orange        1      20
 6 color    yellow        3      60
 7 shape    circle        1      20
 8 shape    rectangle     1      20
 9 shape    square        2      40
10 shape    triangle      1      20

CodePudding user response:

One way with gtsummary

library(purrr)
library(gtsummary)
library(dplyr)
 out <-  map(c("{n}", "{p}.0"),
    ~df1 %>% 
      tbl_summary(
        statistic = ~ .x,
        include = -id
      ) 
  ) %>%
  tbl_merge() %>%
  modify_spanning_header(everything() ~ NA) %>%
  modify_footnote(everything() ~ NA) %>%
  modify_header(list(var_label ~ "**Variable**",
       label ~ "**Level**", stat_0_1 ~ "**freq(n = 5)**",
       stat_0_2 ~ "**percent**")) 
  
 out[1]$table_body <- out[1]$table_body %>% 
            filter(duplicated(var_label)) %>% 
            mutate(var_label = replace(var_label, duplicated(var_label), ""))

-output

enter image description here

data

df1 <- structure(list(id = c(1, 2, 3, 4, 5), animal = c("bear", "dog", 
"dog", "bear", "cat"), color = c("orange", "blue", "yellow", 
"yellow", "yellow"), shape = c("circle", "triangle", "square", 
"square", "rectangle")), class = "data.frame", row.names = c(NA, 
-5L))
  •  Tags:  
  • r
  • Related