I am trying to plot my model's predictions for the 10 years following my data, but I am having trouble setting this up. As the code below shows, I have managed this for the one year after my data; where I seem to be failing is in creating the newdata to plot from. If anyone knows what changes are needed, that would be great. I've tried a few things but can still only get it to plot one year into the future. I've included 5 years of data, so hopefully that is enough to reproduce the issue.
code used;
sst_mod = gam(overall_sst ~ s(timestep, k = 10, bs = "cs") + s(month, k = 5, bs = 'cs'),
data = mod_df,
family = gaussian(link = "identity"))
#predictions
sst_preds = predict(sst_mod, type = 'response', se.fit = TRUE)
#vector of year and month in the future
ts = seq(289, 409, length.out = 520) #needs to be set for 10 years; this is where I'm struggling. I currently have 120 months after the current data
mon = seq(1,12, length.out = 520)
newdata = data.frame(timestep = ts, month = mon)
new_preds = predict(sst_mod, newdata, type = 'response', se.fit = TRUE)
#plot
ggplot(newdata, aes(x = timestep, y = overall_sst)) +
geom_line(aes(timestep, new_preds$fit), col = 'red')
data;
overall_sst year month timestep
16.189 1998 1 1
15.667 1998 2 2
15.509 1998 3 3
16.709 1998 4 4
18.822 1998 5 5
22.722 1998 6 6
25.372 1998 7 7
26.597 1998 8 8
25.256 1998 9 9
22.857 1998 10 10
20.242 1998 11 11
17.179 1998 12 12
16.003 1999 1 13
15.140 1999 2 14
15.522 1999 3 15
16.537 1999 4 16
19.658 1999 5 17
23.245 1999 6 18
25.313 1999 7 19
26.753 1999 8 20
26.040 1999 9 21
23.843 1999 10 22
20.940 1999 11 23
17.842 1999 12 24
15.922 2000 1 25
15.257 2000 2 26
15.369 2000 3 27
16.605 2000 4 28
19.737 2000 5 29
23.086 2000 6 30
25.277 2000 7 31
26.161 2000 8 32
25.314 2000 9 33
22.808 2000 10 34
20.608 2000 11 35
18.163 2000 12 36
16.346 2001 1 37
15.706 2001 2 38
16.111 2001 3 39
16.860 2001 4 40
18.966 2001 5 41
22.467 2001 6 42
25.151 2001 7 43
26.701 2001 8 44
25.267 2001 9 45
24.191 2001 10 46
20.929 2001 11 47
17.570 2001 12 48
15.841 2002 1 49
15.694 2002 2 50
15.920 2002 3 51
16.730 2002 4 52
19.109 2002 5 53
22.738 2002 6 54
25.550 2002 7 55
25.965 2002 8 56
25.352 2002 9 57
23.301 2002 10 58
20.497 2002 11 59
17.859 2002 12 60
CodePudding user response:
If you want five years into the future, you need a sequence of 60 time steps starting from the time step after the final one in your data. The months should be a repeating sequence from 1 to 12, i.e. rep(1:12, 5). For ten years the same pattern gives 120 time steps and rep(1:12, 10).
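The same construction can be written for any number of years. The sketch below is only an illustration: make_future_grid is a hypothetical helper (not part of mgcv or the original code), and it assumes the observed data end in December, so that the first future month is January. The values 60 and 2002 are taken from the last row of the sample data.

```r
# Hypothetical helper: build a monthly prediction grid for n_years
# following the last observed time step.
# Assumption: the observed series ends in December.
make_future_grid <- function(last_timestep, last_year, n_years) {
  n_months <- 12 * n_years
  data.frame(
    timestep = last_timestep + seq_len(n_months),            # consecutive integer steps
    month    = rep(1:12, times = n_years),                   # 1..12 repeated, never fractional
    year     = rep(last_year + seq_len(n_years), each = 12)  # each year repeated 12 times
  )
}

# Ten years after the sample data (last timestep 60, last year 2002)
future <- make_future_grid(last_timestep = 60, last_year = 2002, n_years = 10)
nrow(future)            # 120
range(future$timestep)  # 61 180
```

By contrast, seq(1, 12, length.out = 520) produces 520 evenly spaced, mostly non-integer values between 1 and 12 in a single pass, which is why it does not behave like a repeating calendar month.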
You could also convert the time steps to dates for plotting, to make this easier to understand, and since you are calculating the standard error, display this too:
library(mgcv)
library(ggplot2)
newdata <- data.frame(timestep = 1:60 + max(mod_df$timestep),
month = rep(1:12, 5),
year = rep(max(mod_df$year) + 1:5, each = 12))
new_preds <- predict(sst_mod, newdata, type = 'response', se.fit = TRUE)
newdata$ymin <- new_preds$fit - 1.96 * new_preds$se.fit
newdata$ymax <- new_preds$fit + 1.96 * new_preds$se.fit
newdata$overall_sst <- new_preds$fit
newdata$date <- as.Date(paste(newdata$year, newdata$month, "1", sep = "-"))
ggplot(newdata, aes(x = date, y = overall_sst)) +
geom_ribbon(aes(ymin = ymin, ymax = ymax), alpha = 0.4, fill = "red") +
geom_line(col = 'red') +
theme_minimal(base_size = 16)
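The 1.96 multiplier used for the ribbon is the standard normal quantile for an approximate 95% pointwise interval. As a rough sketch of that arithmetic in base R, the hypothetical helper pointwise_interval below (not an mgcv function) computes the bounds for any level. It assumes the predictions are approximately normal on the response scale, and the numbers passed to it are made up for illustration, not model output.

```r
# Hypothetical helper: pointwise interval from a fit and its standard error.
# Assumption: approximate normality of the predictions.
pointwise_interval <- function(fit, se, level = 0.95) {
  z <- qnorm(1 - (1 - level) / 2)  # about 1.96 when level = 0.95
  data.frame(fit = fit, ymin = fit - z * se, ymax = fit + z * se)
}

# Illustrative values only
ci <- pointwise_interval(fit = c(16.1, 15.6), se = c(0.4, 0.5))
ci
```

With the answer's objects, the call would be pointwise_interval(new_preds$fit, new_preds$se.fit), whose ymin and ymax columns could be used in place of the hand-computed ones.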