Draw Shape Around Consecutive Values Meeting Certain Criteria in ggplot2

Time:12-02

I have a plot made in R similar to the one below. It shows some values along with the mean of those values. I want to draw a rectangle around any run of 5 consecutive values that fall on the same side of the mean. I'm having a hard time accomplishing this; any insight would be appreciated.

EDIT: I should clarify that I want this done automatically; I don't want to set the coordinates of the rectangles manually.

Reproducible example

library(ggplot2)

mydata <- data.frame(
  id = c(1:20),
  result = c(102,99,102,99,102, rep.int(105,5), 102,99,102,99,102, rep.int(95,5))
)

mymean <- 100

ggplot(mydata, aes(x = id, y = result)) +
  geom_point() +
  geom_hline(yintercept = mymean)

Current Plot: [image]

Desired Plot: [image]

Answer:

You will first need an algorithm that goes through the data and classifies it into groups according to your criteria (this is the hardest part). You can then pass the result to geom_rect() to add the rectangles to the chart. The function classify_groups() below does the classification. If you have points that fall on the mean and you don't want them grouped in a rectangle, you can add a condition to test for that.

library(ggplot2)
library(tibble)  # provides add_row()

# Scan the values in order and record each run of 5 consecutive matching values.
classify_groups <- function(df, id_col = "id", val_col = "result") {
  id <- df[[id_col]]
  values <- df[[val_col]]

  current_ids <- c()
  current_values <- c()
  result <- data.frame(id = numeric(),
                       xmin = numeric(),
                       xmax = numeric(),
                       y = numeric())

  for (i in seq_along(id)) {
    if (length(current_values) == 0 || values[i] %in% current_values) {
      # Start a run, or extend it when the value repeats one already in the run
      current_values <- c(current_values, values[i])
      current_ids <- c(current_ids, id[i])
    } else {
      # Different value: start a new run at this point
      current_ids <- c(id[i])
      current_values <- c(values[i])
    }

    if (length(current_values) == 5) {
      # Five in a row: store the extent of the rectangle, then reset
      result <- add_row(result,
                        id = i,
                        xmin = min(current_ids),
                        xmax = max(current_ids),
                        y = max(current_values))
      current_ids <- c()
      current_values <- c()
    }
  }

  result
}

mydata <- data.frame(
  id = c(1:20),
  result = c(102,99,102,99,102, rep.int(105,5), 102,99,102,99,102, rep.int(95,5))
)

mymean <- 100

groups <- classify_groups(mydata)

ggplot(mydata, aes(x = id, y = result)) +
  geom_point() +
  geom_hline(yintercept = mymean) +
  # Pad each rectangle by 0.5 so it surrounds the points rather than cutting through them
  geom_rect(data = groups,
            aes(xmin = xmin - 0.5,
                xmax = xmax + 0.5,
                ymin = y - 0.5,
                ymax = y + 0.5,
                group = id),
            alpha = 0,
            color = 'darkorange',
            size = 1,
            inherit.aes = FALSE)

Result: the plot now has an orange rectangle around points 6 to 10 and another around points 16 to 20.
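
The function above extends a run only when a value equals one already in the run, which fits the example data. If the rule you need is "5 or more consecutive points on the same side of the mean" regardless of the exact values, a minimal sketch using base R's rle() might look like the following. The names find_side_runs, center, and min_run are made up for illustration, and the sketch assumes the rows are already sorted by id. Points exactly on the mean get side 0 and are never boxed.

```r
# Sketch only: find_side_runs, center and min_run are illustrative names,
# not part of the answer above. Assumes rows are sorted by id.
find_side_runs <- function(df, center, min_run = 5,
                           id_col = "id", val_col = "result") {
  ids  <- df[[id_col]]
  vals <- df[[val_col]]

  side <- sign(vals - center)        # 1 above, -1 below, 0 exactly on the mean
  runs <- rle(side)                  # lengths and values of consecutive runs
  ends   <- cumsum(runs$lengths)     # index of the last point in each run
  starts <- ends - runs$lengths + 1  # index of the first point in each run

  # Keep runs that are long enough and not sitting on the mean
  keep <- which(runs$lengths >= min_run & runs$values != 0)

  # One row per qualifying run; y limits come from the points in the run
  data.frame(
    xmin = ids[starts[keep]],
    xmax = ids[ends[keep]],
    ymin = vapply(keep, function(k) min(vals[starts[k]:ends[k]]), numeric(1)),
    ymax = vapply(keep, function(k) max(vals[starts[k]:ends[k]]), numeric(1))
  )
}

mydata <- data.frame(
  id = c(1:20),
  result = c(102,99,102,99,102, rep.int(105,5), 102,99,102,99,102, rep.int(95,5))
)

side_runs <- find_side_runs(mydata, center = 100)
print(side_runs)
```

On the example data this gives two runs: ids 5 to 11 (the 102s at ids 5 and 11 are also above the mean, so they join the run of 105s) and ids 16 to 20. The result can be passed to geom_rect() in the same way as groups above, using ymin - 0.5 and ymax + 0.5 in place of y - 0.5 and y + 0.5.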
