How to color scatter plot points that meet 2 or more conditions in different columns in R

Time:01-07

Here is part of a data frame that I have:

value1  value2  condition1  condition2  condition3
2.3     0.1     FALSE       FALSE       TRUE
3.5     2.6     FALSE       FALSE       TRUE
3.1     2.5     TRUE        TRUE        TRUE
3.2     2.3     FALSE       TRUE        TRUE
2.4     1.1     TRUE        TRUE        FALSE
2.7     2.2     FALSE       TRUE        FALSE
2.5     3       TRUE        FALSE       TRUE
2.9     2       TRUE        TRUE        TRUE
4.2     1       FALSE       FALSE       TRUE
2.2     1.5     FALSE       TRUE        TRUE

I would like to draw a scatter plot of value1 vs value2 and color the points that have 2 or more TRUE conditions.

Do you have any suggestions on how to do this (using ggplot2 and the tidyverse)? Thank you for your time and help.

I have tried to group the conditions with group_by but I have not been successful.

CodePudding user response:

We can build a colour vector by applying rowSums to the condition columns (a row sum greater than 1 means at least two conditions are TRUE) and then pass it to plot from base R:

colr <- ifelse(rowSums(df1[3:5]) > 1, "red", "black")
plot(df1$value1, df1$value2, col = colr, xlab = "value1", ylab = "value2")
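
To make the rowSums step easier to check, here is a minimal base R sketch that prints the per-row count of TRUE values before turning it into colours. The names cond_cols and n_true are introduced here for illustration only, and the condition columns are selected by name prefix rather than by position, which assumes they all start with "condition".

```r
# Same data as in the question, rebuilt with data.frame() so the block runs on its own
df1 <- data.frame(
  value1 = c(2.3, 3.5, 3.1, 3.2, 2.4, 2.7, 2.5, 2.9, 4.2, 2.2),
  value2 = c(0.1, 2.6, 2.5, 2.3, 1.1, 2.2, 3, 2, 1, 1.5),
  condition1 = c(FALSE, FALSE, TRUE, FALSE, TRUE, FALSE, TRUE, TRUE, FALSE, FALSE),
  condition2 = c(FALSE, FALSE, TRUE, TRUE, TRUE, TRUE, FALSE, TRUE, FALSE, TRUE),
  condition3 = c(TRUE, TRUE, TRUE, TRUE, FALSE, FALSE, TRUE, TRUE, TRUE, TRUE)
)

# Pick the logical columns by name (assumes the "condition" prefix)
cond_cols <- grep("^condition", names(df1), value = TRUE)

# TRUE counts as 1 and FALSE as 0, so the row sum is the number of TRUE conditions
n_true <- rowSums(df1[cond_cols])
print(n_true)

# Rows with two or more TRUE conditions get "red", the rest "black"
colr <- ifelse(n_true >= 2, "red", "black")
print(table(colr))
```

The resulting colr vector can be passed to plot through col = colr in the same way as above.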

Or, using the tidyverse, create a colour column from rowSums on the logical columns, map colour = colr in aes(), draw the points with geom_point(), and add scale_colour_identity() so that ggplot2 uses the values in colr as literal colours instead of mapping them through a colour scale:

library(dplyr)
library(ggplot2)
df1 %>% 
  # count TRUE values across the condition columns; 2 or more -> "red"
  mutate(colr = case_when(rowSums(across(starts_with("condition"))) > 1
     ~ "red", TRUE ~ "black")) %>% 
  ggplot(aes(value1, value2, colour = colr)) +
   geom_point() +
    scale_colour_identity()
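
If a legend is wanted, one possible variation (not part of the answer above) is to map a logical column to colour and supply the two colours through scale_colour_manual(). This is a sketch under the assumption that dplyr and ggplot2 are installed; the names plot_df, n_true and two_or_more, and the legend title, are made up for this example.

```r
library(dplyr)
library(ggplot2)

# Same data as in the question, rebuilt so the block runs on its own
df1 <- data.frame(
  value1 = c(2.3, 3.5, 3.1, 3.2, 2.4, 2.7, 2.5, 2.9, 4.2, 2.2),
  value2 = c(0.1, 2.6, 2.5, 2.3, 1.1, 2.2, 3, 2, 1, 1.5),
  condition1 = c(FALSE, FALSE, TRUE, FALSE, TRUE, FALSE, TRUE, TRUE, FALSE, FALSE),
  condition2 = c(FALSE, FALSE, TRUE, TRUE, TRUE, TRUE, FALSE, TRUE, FALSE, TRUE),
  condition3 = c(TRUE, TRUE, TRUE, TRUE, FALSE, FALSE, TRUE, TRUE, TRUE, TRUE)
)

plot_df <- df1 %>%
  mutate(
    # number of TRUE values across the condition columns
    n_true = rowSums(across(starts_with("condition"))),
    # logical flag used as the colour aesthetic
    two_or_more = n_true >= 2
  )

p <- ggplot(plot_df, aes(value1, value2, colour = two_or_more)) +
  geom_point() +
  # named values tie each logical level to a colour
  scale_colour_manual(
    values = c(`TRUE` = "red", `FALSE` = "black"),
    name = "2+ conditions TRUE"
  )

print(p)
```

Compared with scale_colour_identity(), this keeps the colour choice in the scale rather than in the data, at the cost of one extra column.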

data

df1 <- structure(list(value1 = c(2.3, 3.5, 3.1, 3.2, 2.4, 2.7, 2.5, 
2.9, 4.2, 2.2), value2 = c(0.1, 2.6, 2.5, 2.3, 1.1, 2.2, 3, 2, 
1, 1.5), condition1 = c(FALSE, FALSE, TRUE, FALSE, TRUE, FALSE, 
TRUE, TRUE, FALSE, FALSE), condition2 = c(FALSE, FALSE, TRUE, 
TRUE, TRUE, TRUE, FALSE, TRUE, FALSE, TRUE), condition3 = c(TRUE, 
TRUE, TRUE, TRUE, FALSE, FALSE, TRUE, TRUE, TRUE, TRUE)),
 class = "data.frame", row.names = c(NA, 
-10L))