I have a GAM with country now added as a factor, to model the relationship within countries (i think this has been done correctly, let me know if not), and now wish to plot my new model how ever my issue is an aesthetics error (seen below), if anyone knows how to correctly plot this that would be great
! Aesthetics must be either length 1 or the same as the data (480): y
This is the code i have used and a snipet of my data (let me know if you need my whole dataset and i can attach it), hopefully someone is able to plot this, i think the error is coming as im not plotting the model correctly now i have added countries as a factor
model code;
mod = gam(gdp_per_capita ~ s(fisheries_production_pc, k = 25, bs = 'cs') as.factor(Country_Name),
data = economy_df,
family = gaussian(link = "log"))
#predictions
preds = predict(mod, type = 'response', se.fit = TRUE)
#plot model
plot = ggplot(economy_df, aes(y = gdp_per_capita, x = fisheries_production_pc))
geom_point()
geom_line(aes(fisheries_production_pc, preds$fit), colour = 'red')
scale_x_log10()
data;
economy_df
Country_Name year gdp_per_capita fisheries_production
Albania 1997 717.3800 1110.80
Albania 1998 813.7894 2807.50
Albania 1999 1033.2425 3057.90
Albania 2000 1126.6833 3635.00
Albania 2001 1281.6598 3597.20
Albania 2002 1425.1242 4516.80
Bosnia 1997 982.8018 253.00
Bosnia 1998 1102.3907 254.00
Bosnia 1999 1251.7476 255.00
Bosnia 2000 1484.1761 255.00
Bosnia 2001 1544.6021 255.00
Croatia 1997 5312.3695 20551.49
Croatia 1998 5691.1095 27935.08
Croatia 1999 5246.9360 25222.19
Croatia 2000 4887.7137 27944.24
Croatia 2001 5412.9251 29019.12
Cyprus 1997 14234.2441 25788.00
Cyprus 1998 15092.8262 20482.00
Cyprus 1999 15287.9189 41060.00
Cyprus 2000 14388.3477 70223.00
CodePudding user response:
I think you can do something like this:
ggplot(economy_df, aes(y = gdp_per_capita,
x = fisheries_production_pc,
color=Country_Name))
geom_point()
facet_wrap(~Country_Name, scales = "free")
geom_smooth(method="gam", formula=y~s(x,bs="cs"))
scale_x_log10()
CodePudding user response:
It is somewhat easier to use the emmeans
package to generate predictions, especially if you want to average (marginalize) over non-focal predictors (e.g. generating a country-independent prediction from a model that has country as a fixed effect); you can also use the ggpredict
package for this (which wraps the emmeans
package)
library(mgcv)
library(emmeans)
library(ggplot2)
mod <- gam(gdp_per_capita ~ s(fisheries_production_pc) factor(Country_Name),
data = economy_df,
family = gaussian(link = "log"))
## predictions
fpcvec <- exp(seq(log(200), log(40000), length.out = 51))
predfun <- function(mod, specs) {
as.data.frame(
emmeans(mod,
specs = specs,
data = economy_df,
type = "response",
at = list(fisheries_production_pc = fpcvec))
)
}
pred1 <- predfun(mod, specs = ~fisheries_production_pc)
pred2 <- predfun(mod, specs = ~fisheries_production_pc*Country_Name)
#plot model
plot <- ggplot(economy_df,
aes(y = gdp_per_capita, x = fisheries_production_pc))
geom_point()
facet_wrap(~Country_Name)
geom_line(data = pred1, aes(y = response))
geom_line(data = pred2, aes(y = response), colour = 'red')
scale_x_log10()
ggsave("fishpred.png")
Notes:
- here I'm drawing both the country-independent (marginal) prediction (in black) and country-specific prediction (in red)
- if you want to add confidence regions you can use
geom_ribbon
with thelower.CL
andupper.CL
variables thatemmeans
generates - since you fitted an additive effect of country, the country-specific predictions are all parallel (on the log scale!) to the overall prediction/each other. (A model with level-specific smooths would have broken with this small data set ...)
- Pedersen et al 2019 (ref below) is a good reference for the choices you should think about when fitting this kind of multi-country GAM.
Pedersen, Eric J., David L. Miller, Gavin L. Simpson, and Noam Ross. 2019. “Hierarchical Generalized Additive Models in Ecology: An Introduction with Mgcv.” PeerJ 7 (May): e6876. https://doi.org/10.7717/peerj.6876.