How to add a legend to ggplot2?

Time:06-26

I have a dataset with 143 data points. I ran a model over it to predict the last 12 data points and I want to display them, together with the confidence interval, in a chart. The problem is that it is not plotting the legend and I want to include it.

The dataset is as follows. Note that several columns are NA until the 131st entry, because these columns only hold values for the predicted points.

[Image: the dataset]

#Define the chart
grafico = ggplot(data=pd,aes(x=date, y=data)) +
 geom_line(col='black') +   #Real values
 geom_line(aes(y=`Point Forecast`), col = 'red') +   #Predicted values
 geom_ribbon(aes(ymin=`Lo 95`,ymax=`Hi 95`),alpha=.25) +   #Confidence interval
 scale_x_date(name='Time Period', expand=c(0,0)) +   #x label
 scale_y_continuous(name='Male kt') +   #y label
 ggtitle("Male kts - Observed and predicted") +
 theme(plot.title = element_text(hjust = 0.5)) #Put the title in the center of the chart

#Plot the chart
grafico

The chart is below. The red line is the prediction, while the black line is the true value.

[Image: the resulting chart]

Does anyone have an idea of how to make it work with legend?

CodePudding user response:

A legend is produced in ggplot by mapping an aesthetic (in your case color) to a variable. Typically this is done by having your data in long format and mapping color in a single layer. However, since you have two different line layers, you can just map each one, inside aes(), to the character string that you want to appear in the legend. To control the specific colors of the lines, we can use scale_color_manual.

library(ggplot2)

grafico = ggplot(pd, aes(date, kt_male)) +
 geom_line(aes(color = "Real Values")) +
 geom_line(aes(y=`Point Forecast`, color = "Predicted Values")) +
 geom_ribbon(aes(ymin = `Lo 95`, ymax = `Hi 95`, fill = "Confidence interval"), 
             alpha = 0.25) +
 scale_x_date(name = 'Time Period', expand = c(0, 0)) +
 scale_y_continuous(name = 'Male kt') +
 scale_color_manual(values = c("red", "black")) +  #Legend entries sort alphabetically: Predicted = red, Real = black
 scale_fill_manual(values = "gray30", name = NULL) +
 ggtitle("Male kts - Observed and predicted") +
 theme(plot.title = element_text(hjust = 0.5)) 

grafico


Sample data used

set.seed(2)

pd <- data.frame(date = seq(as.Date("2008-01-01"), 
                            as.Date("2020-01-01"), by = "month"),
                 kt_male = cumsum(rnorm(145, 0, 0.25)))
pd$`Point Forecast` <- pd$kt_male + rnorm(145, 0, 0.25)
pd$`Point Forecast`[1:121] <- NA
pd$`Lo 95` <- pd$`Point Forecast` - 0.5
pd$`Hi 95` <- pd$`Point Forecast` + 0.5

Created on 2022-06-25 by the reprex package (v2.0.1)

CodePudding user response:

You need to reorganise your data into long format instead of wide format, and then replace the ggplot call like this (assuming the series name is stored in the column category and the numbers in the column value):

ggplot(data=pd,aes(x=date, y=value, color = category))
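As a rough sketch of what that reshaping could look like, the snippet below stacks an observed and a forecast column into long format using base R only, then maps color to the new column in a single line layer. The data frame `wide` and its columns `observed` and `forecast` are made up for illustration (they mimic the sample data in the first answer), and `category` and `value` are simply the column names assumed above.

```r
library(ggplot2)

# Hypothetical wide data: one column per series (names are assumptions)
set.seed(2)
wide <- data.frame(
  date     = seq(as.Date("2008-01-01"), as.Date("2020-01-01"), by = "month"),
  observed = cumsum(rnorm(145, 0, 0.25))
)
wide$forecast <- wide$observed + rnorm(145, 0, 0.25)
wide$forecast[1:121] <- NA  # forecasts exist only for the last 24 months

# Stack the two series: one row per (date, series) pair
long <- rbind(
  data.frame(date = wide$date, category = "Real Values",
             value = wide$observed),
  data.frame(date = wide$date, category = "Predicted Values",
             value = wide$forecast)
)
long <- long[!is.na(long$value), ]  # drop dates that have no forecast

# A single layer with color mapped to `category` produces the legend;
# the named vector ties each legend entry to its color explicitly
p <- ggplot(long, aes(x = date, y = value, color = category)) +
  geom_line() +
  scale_color_manual(values = c("Real Values" = "black",
                                "Predicted Values" = "red"),
                     name = NULL)
p
```

The confidence ribbon is not covered by this sketch; it still needs the lower and upper bounds as separate columns, so it would be added as its own layer with its own data.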

CodePudding user response:

You can try adding this to the theme function: theme(legend.position = "bottom")
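Note that legend.position only controls where an existing legend is drawn; it does not create one, so an aesthetic still has to be mapped inside aes(). A minimal sketch, using a made-up data frame `df` with a `series` column (both hypothetical):

```r
library(ggplot2)

# Hypothetical toy data with two series
df <- data.frame(
  x      = rep(1:5, times = 2),
  y      = c(1:5, (1:5)^2 / 5),
  series = rep(c("a", "b"), each = 5)
)

# color is mapped inside aes(), so a legend exists;
# theme(legend.position = "bottom") then moves it below the panel
p <- ggplot(df, aes(x, y, color = series)) +
  geom_line() +
  theme(legend.position = "bottom")
p
```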
