My data looks like this:
a | b | c | d | e | f |
---|---|---|---|---|---|
High | Low | High | Medium | Low | High |
High | Medium | Low | High | Medium | Low |
High | Low | High | Medium | Low | High |
High | Medium | Low | High | High | Medium |
I am trying to create a stacked bar chart, where the x axis has each column, a, b, c etc. as each bar, and with the y axis as a count of the occurrence/frequency of high, medium, and low per column, with high, medium, and low split within each single bar/column (3 stacks per bar).
I can use the following code to produce a new dataframe for the count of occurrences in a single column and then plot that in a bar chart, but I'm struggling to combine this into a single stacked bar, then repeat for all the columns.
library(dplyr)
newdf = df %>% dplyr::count(a)
Ideally this would also be a horizontal chart but I can figure that out later.
CodePudding user response:
You need to pivot into long format, then count unique pairings:
library(tidyverse)
df %>%
  pivot_longer(everything()) %>%
  count(name, value) %>%
  ggplot(aes(name, n, fill = factor(value, c("High", "Medium", "Low")))) +
  geom_col(color = "gray50") +
  scale_fill_brewer(palette = "YlGnBu") +
  labs(x = "Column", y = "Count", fill = NULL) +
  theme_minimal(base_size = 16)
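The question also asks for a horizontal chart. A minimal sketch of one way to get there, assuming the same example data as in this thread: build the counts exactly as above, then add coord_flip() to swap the axes. The intermediate name `counts` is only an illustration, not part of the original answer.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Example data matching the table in the question
df <- data.frame(
  a = c("High", "High", "High", "High"),
  b = c("Low", "Medium", "Low", "Medium"),
  c = c("High", "Low", "High", "Low"),
  d = c("Medium", "High", "Medium", "High"),
  e = c("Low", "Medium", "Low", "High"),
  f = c("High", "Low", "High", "Medium")
)

# One row per (column, level) pairing with its frequency
counts <- df %>%
  pivot_longer(everything()) %>%
  count(name, value)

# Same stacked bars as before; coord_flip() makes them run horizontally
p <- ggplot(counts, aes(name, n, fill = factor(value, c("High", "Medium", "Low")))) +
  geom_col(color = "gray50") +
  coord_flip() +
  labs(x = "Column", y = "Count", fill = NULL)
p
```

Keeping the counts in their own object also makes it easy to inspect the long-format table before plotting.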
Data in reproducible format
df <- structure(list(a = c("High", "High", "High", "High"), b = c("Low",
"Medium", "Low", "Medium"), c = c("High", "Low", "High", "Low"
), d = c("Medium", "High", "Medium", "High"), e = c("Low", "Medium",
"Low", "High"), f = c("High", "Low", "High", "Medium")),
class = "data.frame", row.names = c(NA, -4L))
df
#> a b c d e f
#> 1 High Low High Medium Low High
#> 2 High Medium Low High Medium Low
#> 3 High Low High Medium Low High
#> 4 High Medium Low High High Medium
If you don't have factors yet, apply as.factor beforehand,
dat[] <- lapply(dat, as.factor)
or to only temporarily use factors:
sapply(lapply(dat, as.factor), tabulate, nbins=3) |> barplot()
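One thing to watch for: as.factor applied column by column gives each column its own level set (column a only has "High", while b has "Low" and "Medium"), so the integer codes that tabulate counts may not refer to the same category in every column, and the stacks can end up mismatched. A minimal base R sketch that fixes a shared level order first, assuming the only categories are High, Medium, and Low; the names `lev` and `counts` are chosen here for illustration.

```r
# Example data as plain character columns
dat <- data.frame(
  a = c("High", "High", "High", "High"),
  b = c("Low", "Medium", "Low", "Medium"),
  c = c("High", "Low", "High", "Low"),
  d = c("Medium", "High", "Medium", "High"),
  e = c("Low", "Medium", "Low", "High"),
  f = c("High", "Low", "High", "Medium")
)

# Shared level order, so every column is counted against the same categories
lev <- c("High", "Medium", "Low")

# 3 x 6 matrix of counts: one row per level, one column per variable
counts <- sapply(dat, function(x) table(factor(x, levels = lev)))

# Stacked bars; horiz = TRUE draws them horizontally, legend uses the row names
barplot(counts, horiz = TRUE, legend.text = TRUE)
```

Levels that never occur in a column are counted as zero, so every bar is stacked in the same order.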
Data:
dat <- structure(list(a = structure(c(1L, 1L, 1L, 1L), levels = "High", class = "factor"),
b = structure(c(1L, 2L, 1L, 2L), levels = c("Low", "Medium"
), class = "factor"), c = structure(c(1L, 2L, 1L, 2L), levels = c("High",
"Low"), class = "factor"), d = structure(c(2L, 1L, 2L, 1L
), levels = c("High", "Medium"), class = "factor"), e = structure(c(2L,
3L, 2L, 1L), levels = c("High", "Low", "Medium"), class = "factor"),
f = structure(c(1L, 2L, 1L, 3L), levels = c("High", "Low",
"Medium"), class = "factor")), row.names = c(NA, -4L), class = "data.frame")