How to add legends manually for geom_point in ggplot in R? - two scales for the same aesthetic

Time:05-05

I am trying to add legends for several geom_point layers. However, although I have three geom_point layers, the legend appears for only one variable ("Outcome").

In addition to the "Outcome" variable, I want the legend to show the two diamonds: the blue diamond ("TStartTime") and the green diamond ("indicator").

Descriptive figure

# Code to reproduce the data
df <- data.frame(subjectID = factor(1:10, 10:1),
                   stage = rep(c("treated"), times = c(10)),
                   endTime = c(6, 8, 3, 5, 10, 14, 2, 12, 6, 6),
                   Outcome = rep(c("healthy", "disability", "healthy", "disability", NA, NA, NA, NA, "healthy", "disability"), 1),
                   TStartTime=c(1.0, 1.5, 0.3, 0.9, NA, NA, NA, NA, NA, NA),
                   TEndTime=c(6.0, 7.0, 1.2, 1.4, NA, NA, NA, NA, NA, NA),
                   TimeZero=c(0,0,0,0,0,0,0,0,0,0),
                   ind=rep(c(!0, !0, !0, !0, !0), times = c(2, 2, 2, 2, 2)),
                   Garea=c(1.0, 1.5, 0.3, 0.9, 2, 2, NA, NA, NA, NA),
                   indicator=c(NA, NA, NA, NA, 4, 1, 5, 2, NA, NA))
# Code for the plot
library(ggplot2)

gg <- ggplot(df, aes(subjectID, endTime)) +
  scale_fill_manual(values = c("khaki", "orange")) +
  geom_col(aes(fill = factor(stage))) +
  geom_point(data=df, aes(subjectID, TStartTime), colour = c("blue"), fill =alpha(c("#FAFAFA"), 0.2), shape=18, size=4) +
  coord_flip() +  # blue diamond
  geom_point(data=df, aes(subjectID, indicator), colour = c("green"), shape=18, size=4) +
  coord_flip() +  # green diamond for indicator
  geom_point(aes(colour = Outcome, shape = Outcome),  size = 4) +
  coord_flip() +
  scale_colour_manual(values = c('purple','gray'), na.translate=FALSE) +
  scale_y_continuous(limits = c(-0.2, 15), breaks = 0:15) +
  labs(labels= "",
       x       = "ID ",
       fill    = "Status",
       y       = "Days",
       title   = "Plot") +
  theme_classic() +
  theme(plot.title   = element_text(hjust = 0.5),
        plot.caption = element_text(size = 7, hjust = 0))

CodePudding user response:

You're basically looking for a second scale for the same aesthetic, and ggnewscale is your friend. I've added several other comments in the code. In particular, you've called coord_flip many times; this is not necessary and possibly even dangerous. I'd avoid coord_flip altogether (see my comments in the code for how to do that).
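To show the coord_flip point in isolation, here is a minimal sketch with made-up data (the data frame `toy` is hypothetical, not taken from the question). It assumes ggplot2 3.3.0 or later, where geom_col works out the bar orientation from the aesthetics, so putting the discrete variable on y should give horizontal bars without any coord_flip call.

```r
library(ggplot2)

# hypothetical toy data, only for illustration
toy <- data.frame(id = factor(1:3), days = c(6, 8, 3))

# discrete variable on y, continuous variable on x:
# horizontal bars without calling coord_flip()
p <- ggplot(toy) +
  geom_col(aes(y = id, x = days))
p
```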

Technical aspects aside, your visualisation doesn't seem quite ideal to me; it is rather confusing. I wonder whether there are more intuitive ways to present your various variables. Maybe consider facets. A suggestion is below.

library(tidyverse)
library(ggnewscale)

df <- data.frame(
  subjectID = factor(1:10, 10:1),
  stage = rep(c("treated"), times = c(10)),
  endTime = c(6, 8, 3, 5, 10, 14, 2, 12, 6, 6),
  Outcome = rep(c("healthy", "disability", "healthy", "disability", NA, NA, NA, NA, "healthy", "disability"), 1),
  TStartTime = c(1.0, 1.5, 0.3, 0.9, NA, NA, NA, NA, NA, NA),
  TEndTime = c(6.0, 7.0, 1.2, 1.4, NA, NA, NA, NA, NA, NA),
  TimeZero = c(0, 0, 0, 0, 0, 0, 0, 0, 0, 0),
  ind = rep(c(!0, !0, !0, !0, !0), times = c(2, 2, 2, 2, 2)),
  Garea = c(1.0, 1.5, 0.3, 0.9, 2, 2, NA, NA, NA, NA),
  indicator = c(NA, NA, NA, NA, 4, 1, 5, 2, NA, NA)
)

# pivot longer so you can combine TStartTime and indicator into one legend easily
df %>%
  pivot_longer(cols = c(TStartTime, indicator)) %>%
  # remove all the coord_flip calls (you only need one, if not none!)
  ggplot() +
  scale_fill_manual(values = c("khaki", "orange")) +
  # just change the x/y aesthetic in geom_col
  # geom_col would add all values together, so you need to use the un-pivoted data
  geom_col(data = df, mapping = aes(y = subjectID, x = endTime, fill = factor(stage))) +
  # now you only need one geom_point for the new scale, but use the variable in aes()
  geom_point(aes(y = subjectID, x = value, colour = name), shape = 18, size = 4) +
  scale_color_manual(values = c("blue", "green")) +
  # now add a new scale for the same aesthetic (color)
  new_scale_color() +
  geom_point(aes(y = subjectID, x = endTime, colour = Outcome, shape = Outcome), size = 4) +
  ## removing na.translate = FALSE avoids the duplicate legend for outcome
  scale_colour_manual(values = c("purple", "gray"))
#> Warning: Removed 12 rows containing missing values (geom_point).
#> Warning: Removed 8 rows containing missing values (geom_point).
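If it is unclear why `colour = name` produces one legend entry per diamond, it may help to look at the pivoted data on its own. The sketch below uses a cut-down copy of the data (`df_small` and `long` are names introduced here for illustration only) and relies on the default output column names of pivot_longer, `name` and `value`.

```r
library(tidyr)

# cut-down copy of the question's data: only the columns that get pivoted
df_small <- data.frame(
  subjectID  = factor(1:10, 10:1),
  TStartTime = c(1.0, 1.5, 0.3, 0.9, NA, NA, NA, NA, NA, NA),
  indicator  = c(NA, NA, NA, NA, 4, 1, 5, 2, NA, NA)
)

# one row per subject and variable; the former column names end up in `name`
long <- pivot_longer(df_small, cols = c(TStartTime, indicator))

names(long)              # "subjectID" "name" "value"
nrow(long)               # 20 (10 subjects x 2 variables)
sum(is.na(long$value))   # 12 rows without a value to plot
```

Because `name` holds the strings "TStartTime" and "indicator", mapping it to colour gives a single discrete scale with two entries. The 12 missing values are presumably what the first warning above refers to.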

Visualising fewer dimensions / variables is sometimes better. Here is a suggestion for how to avoid double scales for the same aesthetic and use your colour maybe more convincingly. I feel the use of bars might also not be ideal, but this really depends on what the variable "indicator/TStartTime" is and how it relates to endTime. A good aim would be to show the relation between those two variables.

df %>%
  pivot_longer(cols = c(TStartTime, indicator)) %>%
  ggplot() +
  ## all of them are treated, so I am using Outcome as fill variable
  # this removes the need for second geom-point and second scale
  geom_col(data = df, mapping = aes(y = subjectID, x = endTime, fill = Outcome)) +
  scale_fill_manual(values = c("purple", "gray")) +
  geom_point(aes(y = subjectID, x = value, colour = name), shape = 18, size = 4) +
  scale_color_manual(values = c("blue", "green")) +
  ## if you have untreated people, show them in a new facet, e.g., add
  facet_grid(~stage)
#> Warning: Removed 12 rows containing missing values (geom_point).

Created on 2022-05-05 by the reprex package (v2.0.1)
