I'd like to plot the relationship between the number of ladenant
response variable in function of Bioma
(categorical) and temp
(numeric) using binomial negative
generalized linear mixed models (GLMMs) without success. I try to do:
#Packages
library(lme4)
library(ggplot2)
library(ggeffects)
#Open my dataset
myds<-read.csv("https://raw.githubusercontent.com/Leprechault/trash/main/my_glmm_dataset.csv")
myds <- myds[,-c(3)] # remove bad character variable
# Negative binomial GLMM
m.laden.1 <- glmer.nb(ladenant ~ Bioma poly(temp,2) scale(UR) (1 | formigueiro), data = DataBase)
# Plot the results
mydf <- ggpredict(m.laden.1, terms = c("temp","Bioma"))
ggplot(mydf, aes(x, predicted), group = Bioma)
geom_point(DataBase, aes(temp, ladenant), alpha = 0.5) # Observed ladenant response variable
geom_line()
geom_ribbon(aes(ymin = conf.low, ymax = conf.high), alpha = .1)
I have a not so good plot because I don't have one line for each Bioma
and bad lines for temp
variable:
But the object mydf
specification contains the Bioma
variable:
mydf
# # Predicted counts of ladenant
# # Bioma = Atlantic Forest
# temp | Predicted | 95% CI
# ---------------------------------
# 10 | 1.88 | [ 0.81, 4.35]
# 15 | 12.95 | [ 9.11, 18.40]
# 20 | 32.61 | [26.42, 40.25]
# 25 | 30.00 | [23.51, 38.28]
# 30 | 10.08 | [ 4.79, 21.24]
# 35 | 1.24 | [ 0.24, 6.43]
# # Bioma = Transition
# temp | Predicted | 95% CI
# ----------------------------------
# 10 | 6.84 | [ 3.04, 15.42]
# 15 | 47.17 | [34.05, 65.34]
# 20 | 118.79 | [92.27, 152.94]
# 25 | 109.29 | [76.84, 155.43]
# 30 | 36.73 | [16.17, 83.44]
# 35 | 4.51 | [ 0.82, 24.71]
# # Bioma = Pampa
# temp | Predicted | 95% CI
# ---------------------------------
# 10 | 1.42 | [ 0.70, 2.90]
# 15 | 9.80 | [ 7.47, 12.86]
# 20 | 24.69 | [18.74, 32.52]
# 25 | 22.71 | [16.46, 31.35]
# 30 | 7.63 | [ 3.65, 15.96]
# 35 | 0.94 | [ 0.19, 4.67]
# Adjusted for:
# * UR = 82.78
# * formigueiro = 0 (population-level)
Plaese, any help to improve this plot?
CodePudding user response:
I think you just need to be careful with the different names of the variables in the two objects myds
and mydf
, and where you place them in the calls to the various geom
s:
library(lme4)
#> Loading required package: Matrix
library(ggplot2)
library(ggeffects)
#Open my dataset
myds<-read.csv("https://raw.githubusercontent.com/Leprechault/trash/main/my_glmm_dataset.csv")
myds <- myds[,-c(3)] # remove bad character variable
# Negative binomial GLMM
m.laden.1 <- glmer.nb(ladenant ~ Bioma poly(temp,2) scale(UR) (1 | formigueiro),
data = myds)
# Plot the results
mydf <- ggpredict(m.laden.1, terms = c("temp [all]", "Bioma"))
ggplot(mydf, aes(x, predicted))
geom_point(data=myds, aes(temp, ladenant, color = Bioma), alpha = 0.5)
geom_line(aes(color = group))
labs(x = "temp", y = "ladenant")
Note that I haven't included your geom_ribbon
, because the conf.low
and conf.high
are all NA
in the upper half of the curve, which makes it look pretty messy.
Incidentally, the plot might be more informative with a log y scale:
ggplot(mydf, aes(x, predicted))
geom_point(data=myds, aes(temp, ladenant, color = Bioma), alpha = 0.5)
geom_line(aes(color = group))
scale_y_log10()
labs(x = "temp", y = "ladenant")
Created on 2021-11-12 by the reprex package (v2.0.0)