How do I change the color of the regression lines in ggPlot?

Time:12-01

I made a visualization of a regression. Currently this is what the graph looks like.

The regression lines are hard to see since they are the same color as the scatter plot dots.

My question is, how do I make the regression lines a different color from the scatter plot dots?

Here is my code:

(ggplot(data=df, mapping=aes(x='score', y='relent',
                             color='factor(threshold)'))
 + geom_point()
 + scale_color_manual(values=['darkorange', 'purple'])
 + geom_smooth(method='lm',
               formula='y ~ x + I(x**2)', se=False)
 + geom_vline(xintercept=766, color="red", size=1, linetype="dashed")
 + labs(y="Yield",
        x="Score")
 + theme_bw()
)

CodePudding user response:

One option to achieve your desired result would be to "duplicate" your threshold column with different values, e.g. in the code below I map 0 to 2 and 1 to 3. This duplicated column can then be mapped to the color aes inside geom_smooth, which lets you set different colors for the regression lines.

My code below uses R and ggplot2, but to my knowledge it could easily be adapted to plotnine:

n <- 1000
df <- data.frame(
  relent = c(runif(n, 100, 200), runif(n, 150, 250)),
  score = c(runif(n, 764, 766), runif(n, 766, 768)),
  threshold = c(rep(0, n), rep(1, n))
)
df$threshold_sm <- c(rep(2, n), rep(3, n))

library(ggplot2)

p <- ggplot(data = df, mapping = aes(x = score, y = relent, color = factor(threshold))) +
  scale_color_manual(values = c("darkorange", "purple", "blue", "green")) +
  geom_vline(xintercept = 766, color = "red", size = 1, linetype = "dashed") +
  labs(
    y = "Yield",
    x = "Score"
  ) +
  theme_bw()

p +
  geom_point() +
  geom_smooth(aes(color = factor(threshold_sm)),
    method = "lm",
    formula = y ~ x + I(x**2), se = FALSE
  )

A second option would be to add some transparency to the points so that the lines stand out more clearly, which also helps with the overplotting of the points:

p +
  geom_point(alpha = .3) +
  geom_smooth(aes(color = factor(threshold)),
              method = "lm",
              formula = y ~ x + I(x**2), se = FALSE
  ) +
  guides(color = guide_legend(override.aes = list(alpha = 1)))

CodePudding user response:

Compare:

library(ggplot2)
library(magrittr)  # provides the %>% pipe

iris %>%
  ggplot(aes(Petal.Length, Sepal.Width, color = Species)) +
  geom_point() +
  geom_smooth(method = "lm", aes(group = Species))


With:

iris %>%
  ggplot(aes(Petal.Length, Sepal.Width)) +
  geom_point(aes(color = Species)) +
  geom_smooth(method = "lm", aes(group = Species))


When aes(color = ...) is specified inside of ggplot(), it is applied to both of the subsequent geoms. Moving it to geom_point() applies it to the points only.
