Home > Software design >  Computing marginal effects: Why does ggeffect and ggemmeans give difference answers?
Computing marginal effects: Why does ggeffect and ggemmeans give difference answers?

Time:06-27

Example

library(glmmTMB)
library(ggeffects)

## Zero-inflated negative binomial model
(m <- glmmTMB(count ~ spp   mined   (1|site),
              ziformula=~spp   mined, 
              family=nbinom2, 
              data=Salamanders, 
              na.action = "na.fail"))
summary(m)

ggemmeans(m, terms="spp")
spp   | Predicted |       95% CI
--------------------------------
GP    |      1.11 | [0.66, 1.86]
PR    |      0.42 | [0.11, 1.59]
DM    |      1.32 | [0.81, 2.13]
EC-A  |      0.75 | [0.37, 1.53]
EC-L  |      1.81 | [1.09, 3.00]
DES-L |      2.00 | [1.25, 3.21]
DF    |      0.99 | [0.61, 1.62]

ggeffects::ggeffect(m, terms="spp")
spp   | Predicted |       95% CI
--------------------------------
GP    |      1.14 | [0.69, 1.90]
PR    |      0.44 | [0.12, 1.63]
DM    |      1.36 | [0.85, 2.18]
EC-A  |      0.78 | [0.39, 1.57]
EC-L  |      1.87 | [1.13, 3.07]
DES-L |      2.06 | [1.30, 3.28]
DF    |      1.02 | [0.63, 1.65]

Questions

Why are ggeffect and ggemmeans giving different results for the marginal effects? Is it simply something internal with how the packages emmeans and effects are computing them? Also, does anyone know of some resources on how to compute marginal effects from scratch for a model like that in the example?

CodePudding user response:

You fit a complex model: zero-inflated negative binomial model with random effects.

What you observe has little to do with the model specification. Let's show this by fitting a simpler model: Poisson with fixed effects only.

library("glmmTMB")
library("ggeffects")

m <- glmmTMB(
  count ~ spp   mined,
  family = poisson,
  data = Salamanders
)

ggemmeans(m, terms = "spp")
#> # Predicted counts of count
#> 
#> spp   | Predicted |       95% CI
#> --------------------------------
#> GP    |      0.73 | [0.59, 0.89]
#> PR    |      0.18 | [0.12, 0.27]
#> DM    |      0.91 | [0.76, 1.10]
#> EC-A  |      0.34 | [0.25, 0.45]
#> EC-L  |      1.35 | [1.15, 1.59]
#> DES-L |      1.43 | [1.22, 1.68]
#> DF    |      0.79 | [0.64, 0.96]

ggeffect(m, terms = "spp")
#> # Predicted counts of count
#> 
#> spp   | Predicted |       95% CI
#> --------------------------------
#> GP    |      0.76 | [0.62, 0.93]
#> PR    |      0.19 | [0.13, 0.28]
#> DM    |      0.96 | [0.79, 1.15]
#> EC-A  |      0.35 | [0.26, 0.47]
#> EC-L  |      1.41 | [1.20, 1.66]
#> DES-L |      1.50 | [1.28, 1.75]
#> DF    |      0.82 | [0.67, 1.00]

The documentation explains that internally ggemmeans() calls emmeans::emmeans() while ggeffect() calls effects::Effect().

Both emmeans and effects compute marginal effects but it turns out that they make a different (default) choice how to marginalize out (ie. average over) mined in order to get the effect of spp.

mined is a categorical variable with two levels: "yes" and "no". The crucial bit is that the two levels are not balanced: there are slightly more "no"s than "yes"s.

xtabs(~ mined   spp, data = Salamanders)
#>      spp
#> mined GP PR DM EC-A EC-L DES-L DF
#>   yes 44 44 44   44   44    44 44
#>   no  48 48 48   48   48    48 48

Intuitively, this means that the weighted average over mined [think of (44 × yes 48 × no) / 92] is not the same as the simple average over mined [think of (yes no) / 2].

Let's check the intuition by specifying how to marginalize out mined when we call emmeans::emmeans() directly.

# mean (default)
emmeans::emmeans(m, "spp", type = "response", weights = "equal")
#>  spp    rate     SE  df lower.CL upper.CL
#>  GP    0.726 0.0767 636    0.590    0.893
#>  PR    0.181 0.0358 636    0.123    0.267
#>  DM    0.914 0.0879 636    0.757    1.104
#>  EC-A  0.336 0.0497 636    0.251    0.449
#>  EC-L  1.351 0.1120 636    1.148    1.590
#>  DES-L 1.432 0.1163 636    1.221    1.679
#>  DF    0.786 0.0804 636    0.643    0.961
#> 
#> Results are averaged over the levels of: mined 
#> Confidence level used: 0.95 
#> Intervals are back-transformed from the log scale

# weighted mean
emmeans::emmeans(m, "spp", type = "response", weights = "proportional")
#>  spp    rate     SE  df lower.CL upper.CL
#>  GP    0.759 0.0794 636    0.618    0.932
#>  PR    0.190 0.0373 636    0.129    0.279
#>  DM    0.955 0.0909 636    0.793    1.152
#>  EC-A  0.351 0.0517 636    0.263    0.469
#>  EC-L  1.412 0.1153 636    1.203    1.658
#>  DES-L 1.496 0.1196 636    1.279    1.751
#>  DF    0.822 0.0832 636    0.674    1.003
#> 
#> Results are averaged over the levels of: mined 
#> Confidence level used: 0.95 
#> Intervals are back-transformed from the log scale

The second option returns the marginal effects computed with ggeffects::ggeffect.

Created on 2022-06-26 by the reprex package (v2.0.1)

  • Related