How to color specific values in a scatter plot in R

Time:01-20

A sample of my data is:

df<-read.table (text=" No   value
1   -1.25
2   -0.9
3   0.91
4   2.39
5   1.54
6   1.87
7   -2.5
8   -1.73
9   1.26
10  -2.1

", header=TRUE)

Values outside the range -2 to 2 should be coloured, let's say, red. In this example, those are rows 4, 7 and 10. Here is my attempt:

library(ggplot2)
ggplot(df, aes(x=No, y=value)) +
  theme_bw() + geom_text(aes(label=No)) +
  geom_hline(yintercept=2, linetype="dashed", color = "red") +
  geom_hline(yintercept=-2, linetype="dashed", color = "red")

CodePudding user response:

Use ggplot2's aesthetics for color= (and a manual color scale).

library(ggplot2)
ggplot(df, aes(x=No, y=value)) +
  theme_bw() + geom_text(aes(label=No, color=abs(value)>2)) +
  geom_hline(yintercept=2, linetype="dashed", color = "red") +
  geom_hline(yintercept=-2, linetype="dashed", color = "red") +
  scale_color_manual(values = c("FALSE" = "black", "TRUE" = "red"))

(Figure: ggplot2 with outliers colored red)
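
To see what the color aesthetic is actually mapped to, it can help to inspect the logical vector on its own. This is a minimal base-R sketch that re-creates the sample data from the question; the variable name `outside` is only illustrative.

```r
# Sample data from the question
df <- read.table(text = "No value
1 -1.25
2 -0.9
3 0.91
4 2.39
5 1.54
6 1.87
7 -2.5
8 -1.73
9 1.26
10 -2.1", header = TRUE)

# TRUE where the value lies strictly outside the range -2 to 2
outside <- abs(df$value) > 2

# Row numbers that would be drawn in red: 4, 7 and 10
df$No[outside]
```

ggplot2 treats this logical as a discrete variable with the levels FALSE and TRUE, which is why the manual scale is keyed by the strings "FALSE" and "TRUE".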

Reduction: you can combine the two geom_hline calls into one if you'd like:

ggplot(df, aes(x=No, y=value)) +
  theme_bw() + geom_text(aes(label=No, color=abs(value)>2)) +
  geom_hline(yintercept=c(-2,2), linetype="dashed", color = "red") +
  scale_color_manual(values = c("FALSE" = "black", "TRUE" = "red"))

In general, I prefer to use as few geom_*s as strictly required, relying more on ggplot2's internal grouping and aesthetic handling: it is robust, elegant, and at times more flexible when the data changes. There are certainly times when I use multiple geom_* calls and bespoke subsets of the data for each, so it's not a broken paradigm.

The default legend title (the expression abs(value) > 2) is unlikely to be satisfactory in the long term. You can remove the legend entirely with ... + guides(color="none"), or you can pre-process the variable, as Tom's answer demonstrates, which gives you control over the name of the group and its displayed levels.
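
As a rough sketch of the pre-processing idea, assuming the sample data `df` from the question: the column name `range`, the labels "within" and "outside", and the legend title "Value range" are made up here for illustration and are not taken from any of the answers.

```r
library(ggplot2)

# Sample data from the question
df <- read.table(text = "No value
1 -1.25
2 -0.9
3 0.91
4 2.39
5 1.54
6 1.87
7 -2.5
8 -1.73
9 1.26
10 -2.1", header = TRUE)

# Hypothetical helper column: a factor with readable levels, so the legend
# shows these labels instead of FALSE/TRUE
df$range <- factor(ifelse(abs(df$value) > 2, "outside", "within"),
                   levels = c("within", "outside"))

p <- ggplot(df, aes(x = No, y = value)) +
  theme_bw() +
  geom_text(aes(label = No, color = range)) +
  geom_hline(yintercept = c(-2, 2), linetype = "dashed", color = "red") +
  scale_color_manual(name = "Value range",
                     values = c("within" = "black", "outside" = "red"))

p                            # plot with a named legend
p + guides(color = "none")   # same plot with the legend removed
```

Keeping the grouping in the data rather than in the aes() call means the legend title and level order are set in one place, and the same column can be reused for other aesthetics if needed.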

CodePudding user response:

You could create two geom_text layers by subsetting your data twice based on your conditions, like this:

library(ggplot2)
ggplot() +
  geom_text(data = subset(df, value >=2 | value <= -2), 
            aes(x=No, y=value, label = No), color = "red") +
  geom_text(data = subset(df, value < 2 & value > -2), 
            aes(x=No, y=value, label = No)) +
  geom_hline(yintercept=2, linetype="dashed", color = "red") +
  geom_hline(yintercept=-2, linetype="dashed", color = "red") +
  theme_bw()

Created on 2023-01-19
