As the title suggests, I would like to show the frequency of each level in the x-axis tick labels, next to the corresponding label. For example, the subtitle already shows the total number of observations for Question1_3l. On each x-axis tick, I would like the label on one line and the count on the next line, e.g. n=2.
Here is some working code of what I have so far:
dt <- structure(list(Question1 = c("1", "5", "4", "4"),
Question2 = c("2", "5", "4", "4"),
Question3 = c("1", "3", "2", "4"),
Question1_3l = c("3", "3", "2", "1"),
Question2_3l = c("2", "3", "1", "2"),
Question3_3l = c("1", "3", "2", "1")),
class = c("data.frame"),
row.names = c(NA, -4L))
library(ggplot2)

# Bar chart of the share of each level, with percentage labels above the bars
ggplot(dt, aes(as.factor(Question1_3l)), na.rm = TRUE) +
  geom_bar(aes(y = (..count..)/sum(..count..)), fill = "#005A9B") +
  geom_text(aes(y = (..count..)/sum(..count..), label = scales::percent((..count..)/sum(..count..))), stat = "count", vjust = -0.25) +
  scale_y_continuous(labels = scales::percent) +
  scale_x_discrete(breaks = c(1, 2, 3), labels = c("Dissatisfied", "Neutral", "Satisfied")) +
  #scale_y_continuous(breaks = seq(0, .6, .01), labels = scales::percent, limits = c(0, .1)) +
  #ylim(0, .1) +
  scale_fill_brewer(palette = "Set2") +
  labs(title = "",
       subtitle = paste("Number of observations: ", nrow(dt)),
       x = "",
       y = "Disabled",
       color = "Unemployed")
Now I just need to automate the frequencies of each level in the x-axis label, similar to what I did with the subtitle.
Is this possible?
Answer:
You might do this upstream with dplyr, building the labels before the data reaches ggplot:
library(dplyr)
library(ggplot2)

dt %>%
  # add a column n with the frequency of each level
  add_count(Question1_3l) %>%
  # build the tick label: level name, then the count on a second line
  mutate(x_cat = paste0(
    recode(Question1_3l,
           "1" = "Dissatisfied",
           "2" = "Neutral",
           "3" = "Satisfied"),
    "\n(", n, ")")) %>%
  ggplot(aes(x_cat)) +
  geom_bar(aes(y = (..count..)/sum(..count..)), fill = "#005A9B") +
  geom_text(aes(y = (..count..)/sum(..count..), label = scales::percent((..count..)/sum(..count..))), stat = "count", vjust = -0.25) +
  scale_y_continuous(labels = scales::percent) +
  labs(subtitle = paste("Number of observations: ", nrow(dt)),
       x = NULL,
       y = "Disabled",
       color = "Unemployed")
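If you would rather leave the data untouched, the labels can also be computed separately and handed to scale_x_discrete(), much like the subtitle is computed with nrow(dt). What follows is only a minimal sketch under a few assumptions: `level_names` and `x_labels` are names made up for this example, the data frame is cut down to the one column being plotted, and `after_stat(count)` (available in ggplot2 3.3.0 and later) is used in place of `..count..`.

```r
library(ggplot2)

# Reduced copy of the question's data: only the plotted column
dt <- data.frame(Question1_3l = c("3", "3", "2", "1"))

# Assumed mapping from codes to display names (taken from the question)
level_names <- c("1" = "Dissatisfied", "2" = "Neutral", "3" = "Satisfied")

# Count each level in a fixed order, then build "name\nn=<count>" labels
counts <- table(factor(dt$Question1_3l, levels = names(level_names)))
x_labels <- paste0(level_names, "\nn=", as.vector(counts))

p <- ggplot(dt, aes(x = factor(Question1_3l, levels = names(level_names)))) +
  geom_bar(aes(y = after_stat(count / sum(count))), fill = "#005A9B") +
  scale_y_continuous(labels = scales::percent) +
  # breaks and labels are given in the same order, so they line up one to one
  scale_x_discrete(breaks = names(level_names), labels = x_labels, drop = FALSE) +
  labs(subtitle = paste("Number of observations:", nrow(dt)), x = NULL, y = NULL)

print(p)
```

Because the counts come from table() on a factor with explicit levels, a level with no observations would still get a tick labelled n=0 instead of silently disappearing.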
Answer:
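Here is the same approach as Jon Spring's, using other utilities.
We basically do the following up front:
- reshape to long format and count each value
- filter to the question of interest
- recode the values as a factor
- plot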
library(tidyverse)

dt %>%
  pivot_longer(everything()) %>%
  # frequency of each value within each question
  add_count(name, value) %>%
  filter(name == "Question1_3l") %>%
  mutate(value.factor = value %>%
           factor() %>%
           fct_recode("Dissatisfied" = "1", "Neutral" = "2", "Satisfied" = "3")) %>%
  mutate(my_label = paste0(value.factor, " (n=", n, ")")) %>%
  ggplot(aes(x = my_label)) +
  geom_bar(aes(y = (..count..)/sum(..count..)), fill = "#005A9B") +
  geom_text(aes(y = (..count..)/sum(..count..), label = scales::percent((..count..)/sum(..count..))), stat = "count", vjust = -0.25) +
  scale_y_continuous(labels = scales::percent) +
  scale_fill_brewer(palette = "Set2") +
  labs(title = "",
       subtitle = paste("Number of observations: ", nrow(dt)),
       x = "",
       y = "Disabled",
       color = "Unemployed")
Answer:
If you do most of the calculations beforehand, the ggplot call becomes less complicated. See below:
library(dplyr)
library(ggplot2)

# One row per level, with its count, share, and ready-made labels
sumdat <- dt %>%
  count(Question1_3l) %>%
  mutate(
    Question1_3l = factor(
      Question1_3l,
      levels = c(1, 2, 3),
      labels = c("Dissatisfied", "Neutral", "Satisfied")
    ),
    perc = n/sum(n),
    perc_label = paste0(round(100*perc), "%"),
    plotlabels = paste0(Question1_3l, " (", n, ")")
  )

ggplot(sumdat, aes(x = plotlabels, y = perc, label = perc_label)) +
  geom_col(fill = "#005A9B") +
  geom_text(vjust = -.25) +
  scale_y_continuous(labels = scales::percent) +
  labs(
    subtitle = paste("Number of observations: ", nrow(dt)),
    x = NULL,
    y = "Disabled",
    color = "Unemployed"
  )
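One thing to watch with pasted labels: they are plain character strings, so ggplot orders the bars alphabetically by label. That happens to match the intended order for "Dissatisfied", "Neutral", "Satisfied", but it would not for other names. A possible way around it, sketched here with hypothetical level names ("Low", "Medium", "High") chosen only because they do not sort alphabetically, is to turn the labels into a factor whose levels follow the original level order:

```r
library(dplyr)

# Reduced copy of the question's data: only the summarised column
dt <- data.frame(Question1_3l = c("3", "3", "2", "1"))

sumdat <- dt %>%
  count(Question1_3l) %>%
  mutate(
    # hypothetical display names, not the ones used in the question
    Question1_3l = factor(Question1_3l,
                          levels = c("1", "2", "3"),
                          labels = c("Low", "Medium", "High")),
    plotlabels = paste0(Question1_3l, "\nn=", n)
  ) %>%
  # sort by the factor, then freeze that order into the label's levels
  arrange(Question1_3l) %>%
  mutate(plotlabels = factor(plotlabels, levels = plotlabels))

levels(sumdat$plotlabels)
```

Passing this `plotlabels` column as the x aesthetic should then keep the bars in Low, Medium, High order rather than High, Low, Medium.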