Home > Back-end >  Plot generalized linear mixed models (GLMMs): mixture of categorical and numeric variables
Plot generalized linear mixed models (GLMMs): mixture of categorical and numeric variables

Time:11-13

I'd like to plot the relationship between the number of ladenant response variable in function of Bioma (categorical) and temp (numeric) using binomial negative generalized linear mixed models (GLMMs) without success. I try to do:

#Packages
library(lme4)
library(ggplot2)
library(ggeffects)

#Open my dataset
myds<-read.csv("https://raw.githubusercontent.com/Leprechault/trash/main/my_glmm_dataset.csv")
myds <- myds[,-c(3)] # remove bad character variable

# Negative binomial GLMM
m.laden.1  <- glmer.nb(ladenant ~ Bioma    poly(temp,2)   scale(UR)   (1 | formigueiro), data = DataBase)

# Plot the results
mydf <- ggpredict(m.laden.1, terms = c("temp","Bioma"))
ggplot(mydf, aes(x, predicted), group = Bioma)  
  geom_point(DataBase, aes(temp, ladenant), alpha = 0.5)   # Observed ladenant response variable
  geom_line()  
  geom_ribbon(aes(ymin = conf.low, ymax = conf.high), alpha = .1)

I have a not so good plot because I don't have one line for each Bioma and bad lines for temp variable:

my-plot

But the object mydf specification contains the Bioma variable:

mydf 

# # Predicted counts of ladenant

# # Bioma = Atlantic Forest

# temp | Predicted |         95% CI
# ---------------------------------
#   10 |      1.88 | [ 0.81,  4.35]
#   15 |     12.95 | [ 9.11, 18.40]
#   20 |     32.61 | [26.42, 40.25]
#   25 |     30.00 | [23.51, 38.28]
#   30 |     10.08 | [ 4.79, 21.24]
#   35 |      1.24 | [ 0.24,  6.43]

# # Bioma = Transition

# temp | Predicted |          95% CI
# ----------------------------------
#   10 |      6.84 | [ 3.04,  15.42]
#   15 |     47.17 | [34.05,  65.34]
#   20 |    118.79 | [92.27, 152.94]
#   25 |    109.29 | [76.84, 155.43]
#   30 |     36.73 | [16.17,  83.44]
#   35 |      4.51 | [ 0.82,  24.71]

# # Bioma = Pampa

# temp | Predicted |         95% CI
# ---------------------------------
#   10 |      1.42 | [ 0.70,  2.90]
#   15 |      9.80 | [ 7.47, 12.86]
#   20 |     24.69 | [18.74, 32.52]
#   25 |     22.71 | [16.46, 31.35]
#   30 |      7.63 | [ 3.65, 15.96]
#   35 |      0.94 | [ 0.19,  4.67]

# Adjusted for:
# *          UR = 82.78
# * formigueiro = 0 (population-level)

Plaese, any help to improve this plot?

CodePudding user response:

I think you just need to be careful with the different names of the variables in the two objects myds and mydf, and where you place them in the calls to the various geoms:

library(lme4)
#> Loading required package: Matrix
library(ggplot2)
library(ggeffects)

#Open my dataset
myds<-read.csv("https://raw.githubusercontent.com/Leprechault/trash/main/my_glmm_dataset.csv")
myds <- myds[,-c(3)] # remove bad character variable

# Negative binomial GLMM
m.laden.1 <- glmer.nb(ladenant ~ Bioma    poly(temp,2)   scale(UR)   (1 | formigueiro),
                      data = myds)

# Plot the results
mydf <- ggpredict(m.laden.1, terms = c("temp [all]", "Bioma"))

ggplot(mydf, aes(x, predicted))  
  geom_point(data=myds, aes(temp, ladenant, color = Bioma), alpha = 0.5)   
  geom_line(aes(color = group))  
  labs(x = "temp", y = "ladenant")

Note that I haven't included your geom_ribbon, because the conf.low and conf.high are all NA in the upper half of the curve, which makes it look pretty messy.

Incidentally, the plot might be more informative with a log y scale:

ggplot(mydf, aes(x, predicted))  
  geom_point(data=myds, aes(temp, ladenant, color = Bioma), alpha = 0.5)   
  geom_line(aes(color = group))  
  scale_y_log10()  
  labs(x = "temp", y = "ladenant")

enter image description here

Created on 2021-11-12 by the reprex package (v2.0.0)

  • Related