I am trying to graph the proportion of people in Remission (which is binary 0/1) after treatment by year. I can find how to graph the count, but I would like the proportion as there are a different number of people each year.
My data look something like this:
Client_id | Year | Remission |
---|---|---|
2 | 2016 | 0 |
4 | 2017 | 1 |
7 | 2017 | 0 |
8 | 2016 | 1 |
12 | 2016 | 1 |
I would like to create a plot with Year on the x-axis and the proportion of those in remission on the y-axis. Ideally, I would be able to do this both using geom_bar and geom_line.
I have tried this code, but it gives a proportion of 1.00 for every year, which is not correct.
ggplot(data=df) +
  geom_bar(aes(x=Year,y=Remission),stat="identity",position="dodge")
I could calculate the proportions manually for each year and build a table in Excel, but I am hoping there is a way to do it in ggplot2.
Answer:
You could use position = "fill" in your geom_bar and fill = Remission in your ggplot aesthetics, like this:
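library(dplyr)
library(ggplot2)
df %>%
  # Treat Year and Remission as discrete so each gets its own bar and fill colour
  mutate(Year = as.character(Year),
         Remission = as.factor(Remission)) %>%
  ggplot(aes(x=Year, fill = Remission)) +
  # position = "fill" rescales each year's bar to sum to 1
  geom_bar(position = "fill") +
  labs(y = "Proportion")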
Created on 2022-08-22 with reprex v2.0.2
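If only the share in remission is wanted (rather than a stacked bar showing both groups), one option is to let ggplot2 average the 0/1 values itself, since the mean of a binary variable is the proportion of 1s. The sketch below assumes the five sample rows from the question as df and uses stat_summary with fun = mean, which needs ggplot2 3.3.0 or later. As far as I can tell, the original attempt shows 1.00 because stat = "identity" with position = "dodge" draws a year's 0 and 1 rows on top of each other, so only the tallest bar is visible.

```r
library(ggplot2)

# Sample data copied from the question (assumed to be representative of df)
df <- data.frame(
  Client_id = c(2, 4, 7, 8, 12),
  Year      = c(2016, 2017, 2017, 2016, 2016),
  Remission = c(0, 1, 0, 1, 1)
)

# Remission must stay numeric here so that mean() gives the proportion of 1s
p <- ggplot(df, aes(x = factor(Year), y = Remission)) +
  stat_summary(fun = mean, geom = "bar") +
  labs(x = "Year", y = "Proportion in remission")

p
```

With the sample rows this gives bars at 2/3 for 2016 and 1/2 for 2017.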
Percentage scale
If you want a percentage scale, you can use the function percent_format() from the scales package in scale_y_continuous, like this:
library(dplyr)
library(ggplot2)
library(scales)
df %>%
  mutate(Year = as.character(Year),
         Remission = as.factor(Remission)) %>%
  ggplot(aes(x=Year, fill = Remission)) +
  geom_bar(position = "fill") +
  # Show the 0-1 axis as 0%-100%
  scale_y_continuous(labels=percent_format()) +
  labs(y = "Proportion")
Proportion with geom_line
You can do this by first calculating the proportion with count, group_by, and mutate, and then plotting the result like this:
library(dplyr)
library(ggplot2)
df %>%
  mutate(Year = as.numeric(Year),
         Remission = as.factor(Remission)) %>%
  # Rows per Year/Remission combination, then each combination's share within its year
  count(Year, Remission) %>%
  group_by(Year) %>%
  mutate(prop = n/sum(n)) %>%
  ungroup() %>%
  ggplot(aes(x=Year, y = prop, color = Remission)) +
  geom_line() +
  scale_x_continuous(breaks = c(2016,2017)) +
  labs(y = "Proportion")
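If a single line (or bar) for the remission proportion is enough, a shorter route might be to summarise with mean(Remission) per year and plot that table directly. This is only a sketch: it assumes the question's sample rows as df, and the names prop_by_year, p_bar, and p_line are made up for illustration.

```r
library(dplyr)
library(ggplot2)

# Sample data copied from the question (assumed to be representative of df)
df <- data.frame(
  Client_id = c(2, 4, 7, 8, 12),
  Year      = c(2016, 2017, 2017, 2016, 2016),
  Remission = c(0, 1, 0, 1, 1)
)

# The mean of a 0/1 variable is the proportion of 1s in each year
prop_by_year <- df %>%
  group_by(Year) %>%
  summarise(prop = mean(Remission), n = n(), .groups = "drop")

# Bar version: one bar per year, height = proportion in remission
p_bar <- ggplot(prop_by_year, aes(x = factor(Year), y = prop)) +
  geom_col() +
  labs(x = "Year", y = "Proportion in remission")

# Line version: Year stays numeric so the points can be connected
p_line <- ggplot(prop_by_year, aes(x = Year, y = prop)) +
  geom_line() +
  geom_point() +
  scale_x_continuous(breaks = unique(prop_by_year$Year)) +
  labs(y = "Proportion in remission")

p_bar
p_line
```

Keeping n in the summary table makes it easy to check how many clients each proportion is based on, which matters when the yearly group sizes differ.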