I have the following multinomial logit regression
FLAG ~ Cash Debt Other variables
Where FLAG is a dummy with three levels, that distinguishes three different types of firms, while Cash and Debt are some variables regarding those firms.
The results of the regression are the following
Variables Type 0 vs Type 1 Type 0 vs Type 2
Cash 0.543*** 0.321***
Debt -0.124*** 0.452***
I now want to test whether the difference between these coefficients is statistically significant (in this case, between 0.543 and 0.321, and between -0.124 and 0.452)
I know I have to use the Wald test (as it was used in some papers I looked at, that performed the same analysis), but I don't know how to implement it on R.
What's the R code to implement a Wald test of the difference between the coefficients of the model?
CodePudding user response:
Here's an example using the Chile
data from the carData
package.
data(Chile, package="carData")
mod <- nnet::multinom(vote ~ sex I(income/10000), data=Chile)
#> # weights: 16 (9 variable)
#> initial value 3395.034890
#> iter 10 value 3105.730891
#> final value 3043.519706
#> converged
summary(mod)
#> Call:
#> nnet::multinom(formula = vote ~ sex I(income/10000), data = Chile)
#>
#> Coefficients:
#> (Intercept) sexM I(income/10000)
#> N 1.212092 0.5676465119 0.02101608
#> U 1.430206 -0.2235933346 -0.06761914
#> Y 1.477171 -0.0003631522 0.02018734
#>
#> Std. Errors:
#> (Intercept) sexM I(income/10000)
#> N 0.1326298 0.1655421 0.02108768
#> U 0.1352691 0.1738994 0.02478598
#> Y 0.1300724 0.1657089 0.02108419
#>
#> Residual Deviance: 6087.039
#> AIC: 6105.039
Let's say that in the model above, you wanted to compare the coefficients for sexM
for the categories N
and U
. You could save the variance-covariance matrix of the estimators in v
and the coefficients themselves in a vector called b
.
v <- vcov(mod)
b <- c(t(coef(mod)))
names(b) <- rownames(v)
Then, you could create the relevant z-statistic by dividing the difference in coefficients by its standard error (the square root of the sum of the two variances minus two times the covariance of the relevant parameters):
z_wt <- (b["N:sexM"] - b["U:sexM"])/sqrt(v["N:sexM", "N:sexM"] v["U:sexM", "U:sexM"] - 2*v["N:sexM", "U:sexM"])
Then, you could use pnorm()
to calculate the p-value.
2*pnorm(abs(z_wt), lower.tail=FALSE)
#> N:sexM
#> 1.255402e-12
Another option would be to just change the reference category. For example if I wanted the U
-N
comparison, I could change the reference to either U
or N
and re-estimate the model:
Chile$vote2 <- relevel(Chile$vote, "U")
mod2 <- nnet::multinom(vote2 ~ sex I(income/10000), data=Chile, maxit=1000)
#> # weights: 16 (9 variable)
#> initial value 3395.034890
#> iter 10 value 3081.860210
#> final value 3043.519705
#> converged
z_wt2 <- coef(mod2)["N", "sexM"]/summary(mod2)$standard.errors["N", "sexM"]
2*pnorm(abs(z_wt2), lower.tail=FALSE)
#> [1] 1.256475e-12
Note that the two p-values are very close (though not exactly the same) as are the differences in the coefficients from the two methods, which are not printed.
I wrote a package called factorplot
which will calculate all of the pairwise differences for a single variable's coefficients in the MNL model:
library(factorplot)
f <- factorplot(mod, variable="sexM")
print(f)
#> Difference SE p.val
#> A - N -0.568 0.166 0.001
#> A - U 0.224 0.174 0.199
#> N - U 0.791 0.111 0.000
#> A - Y 0.000 0.166 0.998
#> N - Y 0.568 0.098 0.000
#> U - Y -0.223 0.112 0.046
Created on 2022-09-29 by the reprex package (v2.0.1)
This method is a bit easier and is based on the same calculations I proposed in the first method above.