I've looked through old threads without success. I'm trying to plot the share of male legislators in different parliamentary sessions in a histogram.
This is my code, which works but shows the number of legislators (NOT the share). How can I plot the share? Thanks!
mergedf %>%
  ggplot(aes(x = session, fill = factor(sex))) +
  geom_histogram(binwidth = 0.5) +
  theme_minimal() +
  theme(legend.position = "bottom") +
  labs(title = "Share of male legislators by session", x = "Session", y = "Share of legislators",
       fill = "sex")
Edit: I get the share of legislators with this table, but I don't know how to integrate it in the histogram.
mergedf %>%
  tabyl(session, sex) %>%
  adorn_percentages() %>%
  adorn_pct_formatting()
CodePudding user response:
One option would be to use some dplyr verbs to compute the counts and percentages, which could then be displayed as a bar chart (a histogram is something different) via geom_col, like so:
mergedf <- data.frame(
  sessions = c(1, 2, 3, 4, 5, 2, 3, 4, 2),
  sex = c("female", "female", "female", "male", "female", "female", "female", "male", "male")
)
library(dplyr)
library(ggplot2)
mergedf %>%
  # Count legislators per session and sex
  group_by(sessions, sex) %>%
  summarise(n = n()) %>%
  # The result is still grouped by sessions, so sum(n) is the session total
  mutate(pct = n / sum(n)) %>%
  ggplot(aes(x = factor(sessions), y = pct, fill = sex)) +
  geom_col(width = .6) +
  theme_minimal() +
  theme(legend.position = "bottom") +
  labs(title = "Share of male legislators by session", x = "Session", y = "Share of legislators",
       fill = "sex")
#> `summarise()` has grouped output by 'sessions'. You can override using the
#> `.groups` argument.
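The pct step above works because summarise() drops only the last grouping level, so the data are still grouped by sessions when mutate() runs. As a rough sketch of the same computation with the grouping made explicit (the names shares and male_share are illustrative, not from the question), one could write:

```r
library(dplyr)

# Same toy data as in the answer above
mergedf <- data.frame(
  sessions = c(1, 2, 3, 4, 5, 2, 3, 4, 2),
  sex = c("female", "female", "female", "male", "female", "female", "female", "male", "male")
)

shares <- mergedf %>%
  count(sessions, sex) %>%          # one row per session/sex pair, count in column n
  group_by(sessions) %>%            # explicit: shares are computed within each session
  mutate(pct = n / sum(n)) %>%
  ungroup()

# Keep only the male share; sessions without male legislators have no row here
male_share <- shares %>% filter(sex == "male")
print(male_share)
```

Within each session the pct values add up to 1, which is a quick way to check that the grouping is the intended one before passing the table to ggplot.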
CodePudding user response:
You simply need to specify position = "fill" in your geom_histogram parameters:
library(ggplot2)
mergedf <- data.frame(
  session = c(1, 2, 3, 4, 5, 2, 3, 4, 2),
  sex = c("female", "female", "female", "male", "female", "female", "female", "male", "male")
)
ggplot(mergedf, aes(x = session, fill = factor(sex))) +
  geom_histogram(binwidth = 0.5, position = "fill") + # <- HERE
  theme_minimal() +
  theme(legend.position = "bottom") +
  labs(title = "Share of male legislators by session",
       x = "Session", y = "Share of legislators", fill = "sex")
Technically, you're not really building a histogram (a binned distribution of counts) but a bar plot, so you could alternatively use geom_bar with the same position argument:
ggplot(mergedf, aes(x = session, fill = factor(sex))) +
  geom_bar(position = "fill") +
  theme_minimal() +
  theme(legend.position = "bottom") +
  labs(title = "Share of male legislators by session",
       x = "Session", y = "Share of legislators", fill = "sex")
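With position = "fill" the y axis runs from 0 to 1. If percentages are preferred, a possible addition (not part of the original answer) is to format the axis labels with scales::percent, as in this sketch that stores the plot in an illustrative variable p:

```r
library(ggplot2)

# Same toy data as above
mergedf <- data.frame(
  session = c(1, 2, 3, 4, 5, 2, 3, 4, 2),
  sex = c("female", "female", "female", "male", "female", "female", "female", "male", "male")
)

p <- ggplot(mergedf, aes(x = session, fill = factor(sex))) +
  geom_bar(position = "fill") +                    # stack each bar to a total height of 1
  scale_y_continuous(labels = scales::percent) +   # label the 0-1 axis as percentages
  theme_minimal() +
  theme(legend.position = "bottom") +
  labs(title = "Share of male legislators by session",
       x = "Session", y = "Share of legislators", fill = "sex")

p
```

Only the labels change; the bar heights are still the proportions computed by position = "fill".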