adding different columns to ggplot with for loop

Time:10-26

I'm trying to plot some test errors for different models using ggplot, but it seems the for loop is just replacing the last term every time:

library(ggplot2)
library(data.table)  # data.table() is used to build the test data
test <- data.table('obs'= rep(0,10), 'exp'=rnorm(10) ,'pred1'=rnorm(10), 'pred2'=rnorm(10), 'pred3'=rnorm(10), date=1:10)
error_plot <- ggplot() + geom_point(aes(x=date, y = exp - obs), data = test, colour = "black")

pred_names <- paste0('pred', 1:3)
colours_plot <- c('green', 'blue', 'yellow', 'purple')

for (i in 1:length(pred_names)){
  error_plot <- error_plot + geom_point(aes(x=date, y = get(pred_names[i]) - obs), data = test, colour = colours_plot[i])
  print(error_plot)
}

If I run without the loop, everything is fine:

error_plot <- error_plot + geom_point(aes(x=date, y = get(pred_names[1]) - obs), data = test, colour = colours_plot[1]) +
  geom_point(aes(x=date, y = get(pred_names[2]) - obs), data = test, colour = colours_plot[2]) +
  geom_point(aes(x=date, y = get(pred_names[3]) - obs), data = test, colour = colours_plot[3])

Answer:

Since aes() stores its arguments as unevaluated expressions and the rendering is applied lazily, i is not resolved until the plot is rendered, at which point i has been changed, so every layer added in the loop sees the same final value of i. Fortunately, ggplot2 can add a list of geoms as well, so we can use lapply and family to create a list that is fully "realized" (each function call keeps its own copy of i). Depending on your preference of iteration styles, choose one of:

error_plot +
  lapply(seq_along(pred_names), function(i) {
    geom_point(aes(x = date, y = get(pred_names[i]) - obs),
               data = test, colour = colours_plot[i])
  })
## or ##
error_plot +
  Map(function(pn, cn) {
    geom_point(aes(x = date, y = get(pn) - obs), data = test, colour = cn)
  }, pred_names, colours_plot[1:3])

(Note that your pred_names is shorter than colours_plot, ergo the need for [1:3] in the Map-version.)
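
To see the lazy evaluation at work, here is a minimal sketch that builds the same three layers both ways and then inspects the data each layer actually draws. It assumes a plain data.frame stand-in for test, and the names p_loop, p_list, loop_y and list_y are made up for this illustration; layer_data() is the ggplot2 helper that returns the computed data for one layer.

```r
library(ggplot2)

set.seed(1)
# Stand-in data (assumption: a plain data.frame instead of the data.table above)
test <- data.frame(obs = 0, pred1 = rnorm(10), pred2 = rnorm(10),
                   pred3 = rnorm(10), date = 1:10)
pred_names <- paste0("pred", 1:3)

# for loop: each aes() stores the expression get(pred_names[i]) - obs,
# not its value, so i is only looked up when the plot is built
p_loop <- ggplot(test)
for (i in seq_along(pred_names)) {
  p_loop <- p_loop + geom_point(aes(x = date, y = get(pred_names[i]) - obs))
}

# lapply: each function call has its own environment with its own i
p_list <- ggplot(test) +
  lapply(seq_along(pred_names), function(i) {
    geom_point(aes(x = date, y = get(pred_names[i]) - obs))
  })

# y values that each of the three layers ends up plotting
loop_y <- lapply(1:3, function(k) layer_data(p_loop, k)$y)
list_y <- lapply(1:3, function(k) layer_data(p_list, k)$y)

# In the loop version all layers resolve to the last column (pred3) ...
isTRUE(all.equal(loop_y[[1]], test$pred3 - test$obs))
# ... while the lapply version keeps one column per layer
isTRUE(all.equal(list_y[[1]], test$pred1 - test$obs))
```

This also explains why the plot looked right part-way through the loop: print(error_plot) on the first iteration renders while i is still 1.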

But perhaps a more ggplot2-canonical method would be to use long-data for your points, which allows fewer calls, optionally a legend (which I've disabled here), and several other things that aestheticized variables can accomplish:

testlong <- melt(test, id.vars = c("date", "obs", "exp"), variable.name = "pred")
testlong
#      date   obs         exp   pred       value
#     <int> <num>       <num> <fctr>       <num>
#  1:     1     0  0.43281803  pred1  0.27655075
#  2:     2     0 -0.81139318  pred1  0.67928882
#  3:     3     0  1.44410126  pred1  0.08983289
#  4:     4     0 -0.43144620  pred1 -2.99309008
# ---
# 27:     7     0 -0.78383894  pred3  0.25792144
# 28:     8     0  1.57572752  pred3  0.08844023
# 29:     9     0  0.64289931  pred3 -0.12089654
# 30:    10     0  0.08976065  pred3 -1.19432890
#      date   obs         exp   pred       value

ggplot(test) +
  geom_point(aes(x = date, y = exp - obs), colour = "black") +
  geom_point(aes(x = date, y = value - obs, colour = pred), data = testlong) +
  # guide = "none" hides the legend (guide = FALSE is deprecated in recent ggplot2)
  scale_colour_manual(guide = "none", values = setNames(colours_plot[1:3], pred_names))

I think in this case we should not use testlong for the original geom_point, as that would triple-plot each of those points. One could always mitigate that with unique if you wanted to go all-in with a single frame.
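
For completeness, a sketch of that single-frame variant, assuming the same test and testlong as above (rebuilt here so the snippet runs on its own). The name ref_points is made up for this example; it holds one row per date, so each black point is drawn once instead of three times.

```r
library(data.table)
library(ggplot2)

set.seed(42)
test <- data.table(obs = rep(0, 10), exp = rnorm(10), pred1 = rnorm(10),
                   pred2 = rnorm(10), pred3 = rnorm(10), date = 1:10)
pred_names <- paste0("pred", 1:3)
colours_plot <- c("green", "blue", "yellow", "purple")

testlong <- melt(test, id.vars = c("date", "obs", "exp"), variable.name = "pred")

# Collapse the repeated (date, obs, exp) rows back to one row per date
ref_points <- unique(testlong[, .(date, obs, exp)])

p <- ggplot(testlong) +
  geom_point(aes(x = date, y = exp - obs), data = ref_points, colour = "black") +
  geom_point(aes(x = date, y = value - obs, colour = pred)) +
  scale_colour_manual(guide = "none",
                      values = setNames(colours_plot[1:3], pred_names))
print(p)
```

Whether this is cleaner than passing test to the first layer is a matter of taste; it mainly helps when only the long table is at hand.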
