Plotting geom_line() with color conditioning creates lines where value is false. How do I remove?

Time:12-24

I am trying to plot a price chart whose line color depends on a condition on a column value. The coloring works, but annoying straight lines connect the separate segments. How can I remove those 'connecting lines'?


The code checks the value in column 'skew' and creates a Signal column with values of 'Yes' or 'No'. The plot then colors the line based on the Signal column value.

df$skew     <- Skew
df$kurtosis <- Kurtosis

df <- df %>% mutate(Signal = if_else(skew >= 1, 'Yes', 'No'))

#plot
pPrice <- ggplot() +
  ggtitle(df$symbol) +
  geom_line(data = df, aes(x = date, y = adjusted, color = Signal)) +
  xlab('') +
  ylab('Price') +
  theme_bw()

CodePudding user response:

The straight lines are probably due to missing values in the data that were removed before plotting: geom_line then joins the points on either side of each gap. For example,

data(airquality)
data <- transform(airquality, 
                  date=as.Date(paste("1973",Month,Day,sep="-")),
                  windy=Wind>mean(Wind))
head(data)
  Ozone Solar.R Wind Temp Month Day       date windy
1    41     190  7.4   67     5   1 1973-05-01 FALSE
2    36     118  8.0   72     5   2 1973-05-02 FALSE
3    12     149 12.6   74     5   3 1973-05-03  TRUE
4    18     313 11.5   62     5   4 1973-05-04  TRUE
5    NA      NA 14.3   56     5   5 1973-05-05  TRUE
6    28      NA 14.9   66     5   6 1973-05-06  TRUE

library(ggplot2)
library(dplyr)

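# NA rows removed first: the line is drawn straight across each gap
filter(data, !is.na(Ozone)) %>%
  ggplot(aes(x=date, y=Ozone, col=windy)) +
  geom_line(lwd=1)

# NA rows kept: the line breaks wherever Ozone is missing
ggplot(data, aes(x=date, y=Ozone, col=windy)) +
  geom_line(lwd=1)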

Which one you prefer is subjective.


CodePudding user response:

If I understand you correctly, each point on the plot is assigned to either 'Yes' or 'No', and you would like the line to change color according to this. When a discrete variable is mapped to color, geom_line splits the data into one group per value and draws each group as its own continuous line, so the 'Yes' line runs straight across every 'No' stretch and vice versa. Perhaps the neatest way to achieve the effect you want is to plot the whole thing as a series of line segments, where the start point of each segment is your data, and the end point is the lagged data:

library(tidyverse)

ggplot(df) +
  ggtitle(df$symbol) +
  geom_segment(aes(x = date, y = adjusted, xend = lag(date),
                   yend = lag(adjusted), color = Signal)) +
  xlab('') +
  ylab('Price') +
  theme_bw()
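
A commonly used alternative, sketched here on a small made-up data frame (`toy` is hypothetical, not the asker's data), is to keep geom_line() but override the grouping with group = 1. All rows then belong to a single path, and ggplot2 colors each piece of that path by the Signal value at its starting point, so no stretch gets bridged by a line of the other color:

```r
library(ggplot2)

# Hypothetical toy data standing in for the asker's df
toy <- data.frame(
  date     = as.Date("2022-01-01") + 0:9,
  adjusted = c(150, 151, 153, 152, 154, 155, 153, 152, 154, 156),
  Signal   = rep(c("No", "Yes", "No"), times = c(3, 4, 3))
)

# group = 1 puts every row in one path; color still varies along it
p <- ggplot(toy, aes(x = date, y = adjusted, color = Signal, group = 1)) +
  geom_line() +
  xlab("") +
  ylab("Price") +
  theme_bw()

p
```

Whether this or geom_segment reads better is largely a matter of taste; the grouping override simply needs less code.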


Incidentally, I think this is a difficult plot to read. If you want to be able to read off the 'yes' and 'no' values more easily, you might consider using a geom_rect background instead:

ggplot(df) +
  ggtitle(df$symbol) +
  geom_rect(aes(xmin = date, xmax = lag(date), ymin = -Inf, ymax = Inf,
                fill = Signal), alpha = 0.2) +
  geom_line(aes(x = date, y = adjusted)) +
  xlab('') +
  ylab('Price') +
  theme_bw()
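
If one rectangle per row feels heavy, a possible variation is to collapse consecutive rows with the same Signal into runs and draw one rectangle per run. The sketch below assumes a small made-up data frame (`toy`) and introduces hypothetical helper names (`runs`, `run`, `start`, `end`) purely for illustration:

```r
library(ggplot2)
library(dplyr)

# Hypothetical toy data standing in for the asker's df
toy <- data.frame(
  date     = as.Date("2022-01-01") + 0:9,
  adjusted = c(150, 151, 153, 152, 154, 155, 153, 152, 154, 156),
  Signal   = rep(c("No", "Yes", "No"), times = c(3, 4, 3))
)

# A new run starts whenever Signal differs from the previous row
runs <- toy %>%
  mutate(run = cumsum(Signal != lag(Signal, default = first(Signal)))) %>%
  group_by(run, Signal) %>%
  summarise(start = min(date), end = max(date), .groups = "drop")

# One rectangle per run, spanning its first to its last date
p <- ggplot(toy) +
  geom_rect(data = runs,
            aes(xmin = start, xmax = end, ymin = -Inf, ymax = Inf,
                fill = Signal), alpha = 0.2) +
  geom_line(aes(x = date, y = adjusted)) +
  xlab("") +
  ylab("Price") +
  theme_bw()

p
```

Because each rectangle ends on the last date of its run, a one-day gap is left between neighboring rectangles; extending `end` to the next run's `start` would close it if that matters for your chart.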



Data used

You did not provide any data for this example, so I created some with the same names and similar values using the following code:

set.seed(5)

df <- data.frame(date = seq(as.Date('2019-01-01'), 
                            as.Date('2022-12-22'), 'day'),
                 adjusted = round(cumsum(rnorm(1452)) + 150, 1),
                 symbol = 'QQQ')

df$skew <- 0

for(i in seq(nrow(df))[-1]) {
  df$skew[i] <- ifelse(df$skew[i - 1] == 0, 
                       sample(0:1, 1, prob = c(0.9, 0.1)),
                       sample(1:0, prob = c(0.9, 0.1)))
}

df <- df %>% mutate(Signal = if_else(skew >= 1, 'Yes', 'No'))