I have a dataset that looks roughly like this. The code below produces roughly the style of plot that I want, but the proportions are all wrong. If you look at the actual proportions computed in the third code chunk, 42% of yellow party voters responded "no" on var1, but the plot shows over 60%. I think my grouping calculations are somehow messed up in the ggplot2 code. I would like to figure out how to plot proportions correctly this way, rather than going through the trouble of grouping, summarizing and mutating, because I find that tedious.
#sample data
set.seed(42)
var1<-sample(c("yes", "no", "maybe"), size=1000, replace=T, prob=c(0.25, 0.4, 0.35))
var2<-sample(c("yes", "no", "maybe"), size=1000, replace=T, prob=c(0.5, 0.1, 0.4))
var3<-sample(c("yes", "no", "maybe"), size=1000, replace=T, prob=c(0.4, 0.2, 0.4))
party<-sample(c("red", "yellow", "green"), size=1000, replace=T, prob=c(0.5, 0.25, 0.25))
df<-data.frame(var1, var2, var3, party)
library(tidyverse)
#graph I would like
df %>%
pivot_longer(., -party) %>%
ggplot(., aes(y=name, fill=value, x=after_stat(prop), group=value)) +
geom_bar(position="dodge") +
facet_grid(~party)
#actual values
df %>%
pivot_longer(., -party) %>%
group_by(name, party, value) %>%
summarize(n=n()) %>%
mutate(pct=n/sum(n)) %>%
filter(party=="yellow"&name=="var1")
Answer:
One option would be GGally::stat_prop, which adds a by aesthetic to specify the denominator used to compute the proportions.
Note: I added a geom_text layer to add the proportions as labels.
library(GGally)
library(tidyverse)
df %>%
pivot_longer(., -party) %>%
ggplot(., aes(x = after_stat(prop), y = name,
fill = value, group = value,
by = interaction(name, party))) +
geom_bar(position = "dodge", stat = "prop") +
geom_text(position = position_dodge(width = .9), stat = "prop", hjust = 1) +
facet_grid(~party)
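To see why the denominator matters, here is a minimal sketch that reproduces both calculations by hand with dplyr. It assumes that with group = value, after_stat(prop) divides each count by the total for that value within a facet panel, whereas by = interaction(name, party) divides by the total for that question within a party. The object names long, counts, wrong and right are only illustrative.

```r
library(dplyr)
library(tidyr)

# Same sample data as in the question
set.seed(42)
var1  <- sample(c("yes", "no", "maybe"), size = 1000, replace = TRUE, prob = c(0.25, 0.4, 0.35))
var2  <- sample(c("yes", "no", "maybe"), size = 1000, replace = TRUE, prob = c(0.5, 0.1, 0.4))
var3  <- sample(c("yes", "no", "maybe"), size = 1000, replace = TRUE, prob = c(0.4, 0.2, 0.4))
party <- sample(c("red", "yellow", "green"), size = 1000, replace = TRUE, prob = c(0.5, 0.25, 0.25))
df <- data.frame(var1, var2, var3, party)

long   <- pivot_longer(df, -party)
counts <- count(long, party, name, value)

# Assumed behaviour of group = value: the denominator is every row
# with the same answer in the same party panel, across all questions
wrong <- counts %>%
  group_by(party, value) %>%
  mutate(prop = n / sum(n)) %>%
  ungroup()

# Denominator requested via by = interaction(name, party):
# every row for the same question in the same party
right <- counts %>%
  group_by(party, name) %>%
  mutate(prop = n / sum(n)) %>%
  ungroup()

# Compare the two for the case mentioned in the question
filter(wrong, party == "yellow", name == "var1")
filter(right, party == "yellow", name == "var1")
```

The right table should agree with the "actual values" chunk in the question, while wrong should resemble the inflated bars in the original plot.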