I am trying to add legends for several geom_point layers. However, although I have three geom_point layers, the legend appears for only one variable ("Outcome").
In addition to the "Outcome" variable, I want legend entries for the two diamonds: the blue diamond ("TStartTime") and the green one ("indicator").
# Code to reproduce the data
df <- data.frame(
  subjectID = factor(1:10, 10:1),
  stage = rep(c("treated"), times = c(10)),
  endTime = c(6, 8, 3, 5, 10, 14, 2, 12, 6, 6),
  Outcome = rep(c("healthy", "disability", "healthy", "disability", NA, NA, NA, NA, "healthy", "disability"), 1),
  TStartTime = c(1.0, 1.5, 0.3, 0.9, NA, NA, NA, NA, NA, NA),
  TEndTime = c(6.0, 7.0, 1.2, 1.4, NA, NA, NA, NA, NA, NA),
  TimeZero = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0),
  ind = rep(c(!0, !0, !0, !0, !0), times = c(2, 2, 2, 2, 2)),
  Garea = c(1.0, 1.5, 0.3, 0.9, 2, 2, NA, NA, NA, NA),
  indicator = c(NA, NA, NA, NA, 4, 1, 5, 2, NA, NA)
)
# Code for the plot
library(ggplot2)
gg <- ggplot(df, aes(subjectID, endTime)) +
  scale_fill_manual(values = c("khaki", "orange")) +
  geom_col(aes(fill = factor(stage))) +
  geom_point(data = df, aes(subjectID, TStartTime), colour = c("blue"), fill = alpha(c("#FAFAFA"), 0.2), shape = 18, size = 4) +
  coord_flip() + # blue diamond
  geom_point(data = df, aes(subjectID, indicator), colour = c("green"), shape = 18, size = 4) +
  coord_flip() + # green diamond for indicator
  geom_point(aes(colour = Outcome, shape = Outcome), size = 4) +
  coord_flip() +
  scale_colour_manual(values = c('purple', 'gray'), na.translate = FALSE) +
  scale_y_continuous(limits = c(-0.2, 15), breaks = 0:15) +
  labs(labels = "",
       x = "ID ",
       fill = "Status",
       y = "Days",
       title = "Plot") +
  theme_classic() +
  theme(plot.title = element_text(hjust = 0.5),
        plot.caption = element_text(size = 7, hjust = 0))
CodePudding user response:
You're basically looking for a second scale for the same aesthetic, and ggnewscale is your friend here. There are several other comments in the code. In particular, you call coord_flip() several times; this is unnecessary and possibly even dangerous. I'd avoid coord_flip() altogether (see my comments in the code for how to do that).
Technical aspects aside, your visualisation doesn't seem ideal to me, and I find it rather confusing. I wonder whether there are more intuitive ways to present your variables; maybe consider facets. A suggestion follows further below.
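As a small illustration of the coord_flip() point, here is a minimal sketch on made-up toy data (the `toy` data frame and its columns `id` and `days` are hypothetical, not taken from the question). It assumes ggplot2 3.3.0 or later, where geom_col() can work out the bar orientation from the aesthetics:

```r
library(ggplot2)

# Hypothetical toy data, not the data from the question
toy <- data.frame(id = factor(c("a", "b", "c")), days = c(6, 8, 3))

# Older idiom: build vertical bars, then flip the coordinate system once
p_flip <- ggplot(toy, aes(x = id, y = days)) +
  geom_col() +
  coord_flip()

# Alternative: map the discrete variable to y directly, no coord_flip() needed
p_direct <- ggplot(toy, aes(y = id, x = days)) +
  geom_col()
```

Both plots should show horizontal bars. The second form keeps x and y meaning what they say, which tends to make later layers and scales easier to reason about.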
library(tidyverse)
library(ggnewscale)
df <- data.frame(
  subjectID = factor(1:10, 10:1),
  stage = rep(c("treated"), times = c(10)),
  endTime = c(6, 8, 3, 5, 10, 14, 2, 12, 6, 6),
  Outcome = rep(c("healthy", "disability", "healthy", "disability", NA, NA, NA, NA, "healthy", "disability"), 1),
  TStartTime = c(1.0, 1.5, 0.3, 0.9, NA, NA, NA, NA, NA, NA),
  TEndTime = c(6.0, 7.0, 1.2, 1.4, NA, NA, NA, NA, NA, NA),
  TimeZero = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0),
  ind = rep(c(!0, !0, !0, !0, !0), times = c(2, 2, 2, 2, 2)),
  Garea = c(1.0, 1.5, 0.3, 0.9, 2, 2, NA, NA, NA, NA),
  indicator = c(NA, NA, NA, NA, 4, 1, 5, 2, NA, NA)
)
# pivot longer so you can easily combine TStartTime and indicator into one legend
df %>%
  pivot_longer(cols = c(TStartTime, indicator)) %>%
  # remove all the coord_flip calls (you only need one, if not none!)
  ggplot() +
  scale_fill_manual(values = c("khaki", "orange")) +
  # just change the x/y aesthetic in geom_col
  # geom_col would add all values together, so you need to use the un-pivoted data
  geom_col(data = df, mapping = aes(y = subjectID, x = endTime, fill = factor(stage))) +
  # now you only need one geom_point for the new scale, but use the variable in aes()
  geom_point(aes(y = subjectID, x = value, colour = name), shape = 18, size = 4) +
  # named values tie each colour to its variable regardless of level order
  scale_color_manual(values = c(TStartTime = "blue", indicator = "green")) +
  # now add a new scale for the same aesthetic (color)
  new_scale_color() +
  geom_point(aes(y = subjectID, x = endTime, colour = Outcome, shape = Outcome), size = 4) +
  ## removing na.translate = FALSE avoids the duplicate legend for outcome
  scale_colour_manual(values = c("purple", "gray"))
#> Warning: Removed 12 rows containing missing values (geom_point).
#> Warning: Removed 8 rows containing missing values (geom_point).
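To see why a single geom_point() is enough after pivoting, it may help to look at what pivot_longer() produces. The sketch below uses a made-up three-row data frame (`toy` is hypothetical and only borrows the column names from the question); by default the former column names go into a column called `name` and their values into `value`:

```r
library(tidyr)

# Hypothetical toy data that only reuses the question's column names
toy <- data.frame(
  subjectID = factor(1:3),
  TStartTime = c(1.0, NA, 0.3),
  indicator = c(NA, 4, NA)
)

# One row per subject and variable; `name` holds the old column name
long <- pivot_longer(toy, cols = c(TStartTime, indicator))
long
```

Because `name` is now an ordinary variable, mapping it to colour inside aes() gives one legend with an entry per original column. The rows where `value` is NA are the ones geom_point() drops with a warning.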
Visualising fewer dimensions or variables is sometimes better. Here is a suggestion for how to avoid double scales for the same aesthetic and perhaps use colour more convincingly. I feel that bars might also not be ideal, but this really depends on what the variables "indicator" and "TStartTime" are and how they relate to endTime. A good aim would be to show the relation between those variables.
df %>%
  pivot_longer(cols = c(TStartTime, indicator)) %>%
  ggplot() +
  ## all of them are treated, so I am using Outcome as fill variable
  # this removes the need for second geom-point and second scale
  geom_col(data = df, mapping = aes(y = subjectID, x = endTime, fill = Outcome)) +
  scale_fill_manual(values = c("purple", "gray")) +
  geom_point(aes(y = subjectID, x = value, colour = name), shape = 18, size = 4) +
  # named values tie each colour to its variable regardless of level order
  scale_color_manual(values = c(TStartTime = "blue", indicator = "green")) +
  ## if you have untreated people, show them in a new facet, e.g., add
  facet_grid(~stage)
#> Warning: Removed 12 rows containing missing values (geom_point).
Created on 2022-05-05 by the reprex package (v2.0.1)