Plotting colors by value of a variable not being plotted in {ggplot2}

Time:10-28

I am trying to code a plot using the data frame 'swiss' from {datasets} using {ggplot2}. I am plotting Infant.Mortality on the x-axis and Fertility on the y-axis, and I want the points to be colored a transparent blue or orange depending on whether they are at or above, or below, the median value for Education. However, when I plot, I only get transparent blue points and the legend labels are off.

This is the code I have so far:

swiss$color[swiss$Education >= median(swiss$Education)] <- tBlue 
swiss$color[swiss$Education < median(swiss$Education)] <- tOrange

ggplot(data = swiss) +
 geom_point(mapping = aes(x = Infant.Mortality, y = Fertility, color = color)) +
 scale_color_manual(values = swiss$color,
                    labels = ">= median", "<median")

I've also tried what was explained in this question (output

CodePudding user response:

With ggplot we don't normally create a column of color names (that is common in base graphics). Instead, the usual way is to create a column in your data with meaningful labels, like this:

swiss$edu_med = ifelse(swiss$Education >= median(swiss$Education), ">= Median", "< Median")

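ggplot(data = swiss) +
 geom_point(mapping = aes(x = Infant.Mortality, y = Fertility, color = edu_med)) +
 # name each value so the colors do not depend on the order of the labels
 scale_color_manual(values = c(">= Median" = tBlue, "< Median" = tOrange))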

The legend labels will be automatically generated from the data values.
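For reference, here is a self-contained sketch of the label-column approach. The question never shows how tBlue and tOrange were defined, so the adjustcolor() calls below are assumptions standing in for them, and the legend title "Education" is just an illustrative choice.

```r
library(ggplot2)

# Assumed definitions: placeholders for the asker's tBlue and tOrange.
tBlue   <- adjustcolor("blue", alpha.f = 0.5)
tOrange <- adjustcolor("orange", alpha.f = 0.5)

swiss <- datasets::swiss

# Label each row by which side of the median Education it falls on.
swiss$edu_med <- ifelse(swiss$Education >= median(swiss$Education),
                        ">= Median", "< Median")

# A named values vector ties each color to its label explicitly.
p <- ggplot(data = swiss) +
  geom_point(mapping = aes(x = Infant.Mortality, y = Fertility, color = edu_med)) +
  scale_color_manual(values = c(">= Median" = tBlue, "< Median" = tOrange),
                     name = "Education")
print(p)
```

Naming the elements of values avoids relying on the alphabetical order of the labels, which would otherwise decide which group gets which color.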

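It is also possible to do it the way you have in the question: keep the column of color values and use scale_color_identity(guide = "legend", breaks = c(tBlue, tOrange), labels = c(">= median", "< median")) instead of scale_color_manual(). The guide = "legend" argument is needed because identity scales draw no legend by default, and the labels must be wrapped in c() so that both are passed to labels.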
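A minimal sketch of the identity-scale variant, keeping the color column from the question. As before, the tBlue and tOrange definitions are assumptions, since the question does not show them; breaks is supplied so that each label is paired with a known color.

```r
library(ggplot2)

# Assumed definitions: placeholders for the asker's tBlue and tOrange.
tBlue   <- adjustcolor("blue", alpha.f = 0.5)
tOrange <- adjustcolor("orange", alpha.f = 0.5)

swiss <- datasets::swiss

# Store the actual color value for each row, as in the question.
swiss$color <- ifelse(swiss$Education >= median(swiss$Education), tBlue, tOrange)

# The identity scale uses the column values as colors directly;
# guide = "legend" turns on the legend, which is off by default.
p <- ggplot(data = swiss) +
  geom_point(mapping = aes(x = Infant.Mortality, y = Fertility, color = color)) +
  scale_color_identity(guide = "legend",
                       breaks = c(tBlue, tOrange),
                       labels = c(">= median", "< median"),
                       name = "Education")
print(p)
```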
