I am trying to create a table with the frequency of multiple categorical variables. I have data like this:
id animal color shape
1 bear orange circle
2. dog blue triangle
3. dog yellow square
4. bear yellow square
5. cat yellow rectangle
I want to get a table like this:
Variable Level freq(n=5) percent
animal bear 2 40.0
dog 2 40.0
cat 1 20.0
color orange 1 20.0
blue 1 20.0
yellow 3 60.0
shape circle 1 20.0
triangle 1 20.0
square 2 40.0
rectangle 1 20.0
I can make frequency tables for them individually but was wondering if anyone knew a package or anything that can easily stack them like this. Thanks!
CodePudding user response:
Update: Better:
df1 <- df %>%
pivot_longer(
-id,
names_to = "Variable",
values_to = "Level"
) %>%
group_by(Variable, Level) %>%
summarise(freq = n()) %>%
mutate(percent = freq/sum(freq)*100) %>%
mutate(Variable = ifelse(duplicated(Variable), NA, Variable)) %>%
ungroup()
library(flextable)
flextable(df1)
library(dplyr)
library(tidyr)
df %>%
pivot_longer(
-id,
names_to = "Variable",
values_to = "Level"
) %>%
group_by(Variable, Level) %>%
summarise(freq = n()) %>%
mutate(percent = freq/sum(freq)*100)
Variable Level freq percent
<chr> <chr> <int> <dbl>
1 animal bear 2 40
2 animal cat 1 20
3 animal dog 2 40
4 color blue 1 20
5 color orange 1 20
6 color yellow 3 60
7 shape circle 1 20
8 shape rectangle 1 20
9 shape square 2 40
10 shape triangle 1 20
CodePudding user response:
One way with gtsummary
library(purrr)
library(gtsummary)
library(dplyr)
out <- map(c("{n}", "{p}.0"),
~df1 %>%
tbl_summary(
statistic = ~ .x,
include = -id
)
) %>%
tbl_merge() %>%
modify_spanning_header(everything() ~ NA) %>%
modify_footnote(everything() ~ NA) %>%
modify_header(list(var_label ~ "**Variable**",
label ~ "**Level**", stat_0_1 ~ "**freq(n = 5)**",
stat_0_2 ~ "**percent**"))
out[1]$table_body <- out[1]$table_body %>%
filter(duplicated(var_label)) %>%
mutate(var_label = replace(var_label, duplicated(var_label), ""))
-output
data
df1 <- structure(list(id = c(1, 2, 3, 4, 5), animal = c("bear", "dog",
"dog", "bear", "cat"), color = c("orange", "blue", "yellow",
"yellow", "yellow"), shape = c("circle", "triangle", "square",
"square", "rectangle")), class = "data.frame", row.names = c(NA,
-5L))