I would like to compute so-called first- and second- differences (as defined e.g. here) with the package marginaleffects
, with standard errors, but I am not sure which function I can use.
Consider this example model:
library(marginaleffects)
mod <- lm(mpg ~ hp * factor(cyl), data = mtcars)
pred <- predictions(mod, newdata = datagrid(cyl = levels(factor(mtcars$cyl)),
hp = seq(80, 110, 10)))
# rowid type predicted std.error statistic p.value conf.low conf.high cyl hp mpg
# 1 1 response 26.96095 0.9210528 29.271888 2.364796e-188 25.06770 28.85421 4 80 21
# 2 2 response 25.83320 0.9732779 26.542467 3.137628e-155 23.83259 27.83380 4 90 21
# 3 3 response 24.70544 1.2102719 20.413129 1.278321e-92 22.21769 27.19319 4 100 21
# 4 4 response 23.57768 1.5494525 15.216780 2.736718e-52 20.39273 26.76262 4 110 21
# 5 5 response 20.06479 2.4401222 8.222863 1.987022e-16 15.04905 25.08053 6 80 21
# 6 6 response 19.98866 2.0043577 9.972599 2.009006e-23 15.86864 24.10867 6 90 21
# 7 7 response 19.91252 1.6124942 12.348897 4.938402e-35 16.59799 23.22705 6 100 21
# 8 8 response 19.83639 1.3047065 15.203720 3.340987e-52 17.15453 22.51825 6 110 21
# 9 9 response 16.94054 2.2777361 7.437448 1.026490e-13 12.25859 21.62250 8 80 21
# 10 10 response 16.79810 2.1245264 7.906752 2.641918e-15 12.43108 21.16513 8 90 21
# 11 11 response 16.65566 1.9731806 8.441022 3.145744e-17 12.59973 20.71159 8 100 21
# 12 12 response 16.51322 1.8241629 9.052492 1.397416e-19 12.76360 20.26284 8 110 21
Given the pred
dataframe, how can one estimate
First difference. The difference between each contrasting value, for instance between
cyl == 8
andcyl == 4
for each of thehp
values, with a similar output than inpred
(e.g. with p.value, std.error...)?Second difference. The difference between those differences, i.e. for each of the differences computed in the first step, the difference between them as well as the standard errors.
I am looking for a method that could hopefully work also for logistic regression.
CodePudding user response:
First differences
The comparisons()
function gives first differences. You will find a very detailed tutorial in this vignette: https://vincentarelbundock.github.io/marginaleffects/articles/contrasts.html
library(marginaleffects)
mod <- lm(mpg ~ hp * factor(cyl), data = mtcars)
comparisons(
mod,
variables = "cyl",
by = "hp",
newdata = datagrid(cyl = unique, hp = seq(80, 110, 10))) |>
summary()
#> Term Contrast hp Effect Std. Error z value Pr(>|z|) 2.5 %
#> 1 cyl mean(6) - mean(4) 80 -6.896 2.608 -2.644 0.00819167 -12.008
#> 2 cyl mean(6) - mean(4) 90 -5.845 2.228 -2.623 0.00871522 -10.212
#> 3 cyl mean(6) - mean(4) 100 -4.793 2.016 -2.377 0.01744227 -8.745
#> 4 cyl mean(6) - mean(4) 110 -3.741 2.026 -1.847 0.06474713 -7.711
#> 5 cyl mean(8) - mean(4) 80 -10.020 2.457 -4.078 4.5336e-05 -14.836
#> 6 cyl mean(8) - mean(4) 90 -9.035 2.337 -3.866 0.00011048 -13.615
#> 7 cyl mean(8) - mean(4) 100 -8.050 2.315 -3.478 0.00050600 -12.587
#> 8 cyl mean(8) - mean(4) 110 -7.064 2.393 -2.952 0.00316093 -11.755
#>
#> Model type: lm
#> Prediction type: response
Second differences
You can take differences in differences using the hypothesis
argument. You will find a very detailed tutorial in this vignette: https://vincentarelbundock.github.io/marginaleffects/articles/hypothesis.html
comparisons(
mod,
variables = "cyl",
by = "hp",
hypothesis = "pairwise",
newdata = datagrid(cyl = unique, hp = seq(80, 110, 10))) |>
summary()
#> Term Effect Std. Error z value
#> 1 mean(6) - mean(4), 80 - mean(6) - mean(4), 90 -1.0516 0.6848 -1.53560
#> 2 mean(6) - mean(4), 80 - mean(6) - mean(4),100 -2.1033 1.3697 -1.53560
#> 3 mean(6) - mean(4), 80 - mean(6) - mean(4),110 -3.1549 2.0545 -1.53560
#> 4 mean(6) - mean(4), 80 - mean(8) - mean(4), 80 3.1242 3.3380 0.93596
#> 5 mean(6) - mean(4), 80 - mean(8) - mean(4), 90 2.1389 3.2676 0.65459
#> 6 mean(6) - mean(4), 80 - mean(8) - mean(4),100 1.1536 3.2688 0.35292
#> 7 mean(6) - mean(4), 80 - mean(8) - mean(4),110 0.1683 3.3414 0.05037
#> 8 mean(6) - mean(4), 90 - mean(6) - mean(4),100 -1.0516 0.6848 -1.53560
#> 9 mean(6) - mean(4), 90 - mean(6) - mean(4),110 -2.1033 1.3697 -1.53560
#> 10 mean(6) - mean(4), 90 - mean(8) - mean(4), 80 4.1759 3.0684 1.36095
#> 11 mean(6) - mean(4), 90 - mean(8) - mean(4), 90 3.1906 2.9208 1.09236
#> 12 mean(6) - mean(4), 90 - mean(8) - mean(4),100 2.2052 2.8496 0.77388
#> 13 mean(6) - mean(4), 90 - mean(8) - mean(4),110 1.2199 2.8604 0.42648
#> 14 mean(6) - mean(4),100 - mean(6) - mean(4),110 -1.0516 0.6848 -1.53560
#> 15 mean(6) - mean(4),100 - mean(8) - mean(4), 80 5.2275 2.9369 1.77994
#> 16 mean(6) - mean(4),100 - mean(8) - mean(4), 90 4.2422 2.7061 1.56763
#> 17 mean(6) - mean(4),100 - mean(8) - mean(4),100 3.2569 2.5483 1.27808
#> 18 mean(6) - mean(4),100 - mean(8) - mean(4),110 2.2715 2.4773 0.91695
#> 19 mean(6) - mean(4),110 - mean(8) - mean(4), 80 6.2791 2.9621 2.11984
#> 20 mean(6) - mean(4),110 - mean(8) - mean(4), 90 5.2938 2.6557 1.99334
#> 21 mean(6) - mean(4),110 - mean(8) - mean(4),100 4.3085 2.4094 1.78824
#> 22 mean(6) - mean(4),110 - mean(8) - mean(4),110 3.3232 2.2427 1.48175
#> 23 mean(8) - mean(4), 80 - mean(8) - mean(4), 90 -0.9853 0.4862 -2.02641
#> 24 mean(8) - mean(4), 80 - mean(8) - mean(4),100 -1.9706 0.9725 -2.02641
#> 25 mean(8) - mean(4), 80 - mean(8) - mean(4),110 -2.9560 1.4587 -2.02641
#> 26 mean(8) - mean(4), 90 - mean(8) - mean(4),100 -0.9853 0.4862 -2.02641
#> 27 mean(8) - mean(4), 90 - mean(8) - mean(4),110 -1.9706 0.9725 -2.02641
#> 28 mean(8) - mean(4),100 - mean(8) - mean(4),110 -0.9853 0.4862 -2.02641
#>
#> Model type: lm
#> Prediction type: response