Plotting predictions for future years

Time:11-13

I am trying to plot predictions for the 10 years following my data, but I'm having trouble setting this up. As the code below shows, I have managed this for the one year after my data; where I seem to be going wrong is in creating the newdata to predict from. If anyone knows what changes are needed, that would be great. I've tried a few things but can still only get it to plot one year into the future. I've included 5 years of data, which will hopefully be enough to reproduce the issue.

code used;

sst_mod = gam(overall_sst ~ s(timestep, k = 10, bs = "cs") + s(month, k = 5, bs = 'cs'),
             data = mod_df, 
             family = gaussian(link = "identity"))
#predictions
sst_preds = predict(sst_mod, type = 'response', se.fit = TRUE)

#vector of year and month in the future
ts = seq(289, 409, length.out = 520) # needs to cover 10 years, which is where I'm struggling; currently I have 120 months after the current data
mon = seq(1,12, length.out = 520)
newdata = data.frame(timestep = ts, month = mon)
new_preds = predict(sst_mod, newdata, type = 'response', se.fit = TRUE)
#plot
ggplot(newdata, aes(x = timestep, y = overall_sst)) +
  geom_line(aes(timestep, new_preds$fit), col = 'red')

data;

overall_sst   year   month   timestep
16.189        1998     1        1
15.667        1998     2        2
15.509        1998     3        3
16.709        1998     4        4
18.822        1998     5        5
22.722        1998     6        6
25.372        1998     7        7
26.597        1998     8        8
25.256        1998     9        9
22.857        1998    10       10
20.242        1998    11       11
17.179        1998    12       12
16.003        1999     1       13
15.140        1999     2       14
15.522        1999     3       15
16.537        1999     4       16
19.658        1999     5       17
23.245        1999     6       18
25.313        1999     7       19
26.753        1999     8       20
26.040        1999     9       21
23.843        1999    10       22
20.940        1999    11       23
17.842        1999    12       24
15.922        2000     1       25
15.257        2000     2       26
15.369        2000     3       27
16.605        2000     4       28
19.737        2000     5       29
23.086        2000     6       30
25.277        2000     7       31
26.161        2000     8       32
25.314        2000     9       33
22.808        2000    10       34
20.608        2000    11       35
18.163        2000    12       36
16.346        2001     1       37
15.706        2001     2       38
16.111        2001     3       39
16.860        2001     4       40
18.966        2001     5       41
22.467        2001     6       42
25.151        2001     7       43
26.701        2001     8       44
25.267        2001     9       45
24.191        2001    10       46
20.929        2001    11       47
17.570        2001    12       48
15.841        2002     1       49
15.694        2002     2       50
15.920        2002     3       51
16.730        2002     4       52
19.109        2002     5       53
22.738        2002     6       54
25.550        2002     7       55
25.965        2002     8       56
25.352        2002     9       57
23.301        2002    10       58
20.497        2002    11       59
17.859        2002    12       60

Answer:

If you want five years in the future, you need to have a sequence of 60 time steps starting from the timestep after your data's final time step. The months should be a repeating sequence from 1 to 12, i.e. rep(1:12, 5).
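The same idea extends to any horizon, including the 10 years asked about. Below is a minimal base-R sketch of building such a prediction grid. The helper name `make_future` and its arguments are hypothetical (they are not part of mgcv or the original code), and it assumes the observed data end in December, so the first future month is January.

```r
# Hypothetical helper: build a newdata frame covering n_years of future months.
# Assumes the observed series ends in month 12 (December).
make_future <- function(last_timestep, last_year, n_years) {
  n_months <- 12 * n_years
  data.frame(
    timestep = last_timestep + seq_len(n_months),            # continues the time index
    month    = rep(1:12, times = n_years),                   # repeating 1..12 cycle
    year     = rep(last_year + seq_len(n_years), each = 12)  # each year repeated 12 times
  )
}

# With the sample data above (last timestep 60, last year 2002), 10 years ahead:
future <- make_future(last_timestep = 60, last_year = 2002, n_years = 10)
head(future, 3)
```

With the real data one would pass `max(mod_df$timestep)` and `max(mod_df$year)` instead of the literal values, and hand the result to `predict()` as `newdata`.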

You could also convert the time steps to dates for plotting, to make this easier to understand, and since you are calculating the standard error, display this too:

library(mgcv)
library(ggplot2)

newdata <- data.frame(timestep = 1:60 + max(mod_df$timestep), 
                      month = rep(1:12, 5), 
                      year = rep(max(mod_df$year) + 1:5, each = 12))

new_preds <- predict(sst_mod, newdata, type = 'response', se.fit = TRUE)
newdata$ymin <- new_preds$fit - 1.96 * new_preds$se.fit
newdata$ymax <- new_preds$fit + 1.96 * new_preds$se.fit
newdata$overall_sst <- new_preds$fit
newdata$date <- as.Date(paste(newdata$year, newdata$month, "1", sep = "-"))

ggplot(newdata, aes(x = date, y = overall_sst)) +
  geom_ribbon(aes(ymin = ymin, ymax = ymax), alpha = 0.4, fill = "red") +
  geom_line(aes(y = new_preds$fit), col = 'red') +
  theme_minimal(base_size = 16)
