I am going through an R example of using interaction terms in a fixed effects model. The example can be found here.
The example uses the fixest package and the syntax var::fe(ref). I don't understand what ref is or what it does here. How do I select the value for ref?
I came across this explanation on Google: "You can interact a numeric variable with a "factor-like" variable by using i(factor_var, continuous_var, ref), where continuous_var will be interacted with each value of factor_var and the argument ref is a value of factor_var taken as a reference (optional)." I do not understand the role of this "reference" here.
Any insight will be highly appreciated.
CodePudding user response:
When you estimate a model with a categorical predictor entered as a series of dummy variables or, equivalently, a fixed effects model, you must omit one of the dummies to avoid perfect collinearity with the intercept. The dummy you omit is the “reference category”.
The choice of reference category is arbitrary: it does not change the predictions of the model, but it does affect how you interpret the coefficients of the remaining dummy variables, because each one is measured relative to the omitted category. This is well known and covered in most introductory regression textbooks.
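To make the point about predictions concrete, here is a minimal sketch in base R. It uses relevel() with lm() rather than fixest, and the names dat, cyl4, cyl8, fit_ref4 and fit_ref8 are made up for illustration. It fits the same model twice, changing only which cyl level is omitted, and checks that the fitted values agree.

```r
# Fit the same model twice, changing only the reference level of cyl.
# All object names here are illustrative, not part of any package.
dat <- mtcars
dat$cyl4 <- relevel(factor(dat$cyl), ref = "4")  # cyl = 4 is the omitted category
dat$cyl8 <- relevel(factor(dat$cyl), ref = "8")  # cyl = 8 is the omitted category

fit_ref4 <- lm(mpg ~ drat + cyl4 * hp, data = dat)
fit_ref8 <- lm(mpg ~ drat + cyl8 * hp, data = dat)

# The fitted values should agree up to floating-point error.
same_predictions <- isTRUE(all.equal(unname(fitted(fit_ref4)),
                                     unname(fitted(fit_ref8))))
print(same_predictions)

# The coefficients differ, because each set is measured against a different baseline.
print(coef(fit_ref4))
print(coef(fit_ref8))
```

Only the labelling of the baseline changes between the two fits, so the choice of ref can be driven by whichever comparison is easiest to interpret.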
In fixest, you can use the ref argument of the i() function to determine which category will be omitted. Below, you will see that the drat coefficient stays exactly the same, but that the other coefficients change because the reference category changes:
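library(fixest)
library(modelsummary)
# Base R: factor() omits the first level (cyl = 4) by default
mod1 <- lm(mpg ~ drat + factor(cyl) * hp, data = mtcars)
# fixest without ref: one redundant term is dropped automatically
mod2 <- feols(mpg ~ drat + hp * i(cyl), data = mtcars)
#> The variable 'hp:cyl::8' has been removed because of collinearity (see $collin.var).
# fixest with an explicit reference category (cyl = 8)
mod3 <- feols(mpg ~ drat + hp * i(cyl, ref = 8), data = mtcars)
models <- list(mod1, mod2, mod3)
modelsummary(models, fmt = 6)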
| | Model 1 | Model 2 | Model 3 |
|---|---|---|---|
| (Intercept) | 26.771696 | 26.771696 | 13.796313 |
| | (8.719507) | (8.719507) | (5.057123) |
| drat | 1.939525 | 1.939525 | 1.939525 |
| | (1.646230) | (1.646230) | (1.646230) |
| factor(cyl)6 | -12.041741 | | |
| | (7.883606) | | |
| factor(cyl)8 | -12.975383 | | |
| | (6.689497) | | |
| hp | -0.096854 | -0.023706 | -0.023706 |
| | (0.047378) | (0.018221) | (0.018221) |
| factor(cyl)6 × hp | 0.080976 | | |
| | (0.071010) | | |
| factor(cyl)8 × hp | 0.073149 | | |
| | (0.052855) | | |
| cyl = 6 | | -12.041741 | 0.933642 |
| | | (7.883606) | (7.341465) |
| cyl = 8 | | -12.975383 | |
| | | (6.689497) | |
| hp × cyl = 4 | | -0.073149 | -0.073149 |
| | | (0.052855) | (0.052855) |
| hp × cyl = 6 | | 0.007828 | 0.007828 |
| | | (0.053174) | (0.053174) |
| cyl = 4 | | | 12.975383 |
| | | | (6.689497) |
| Num.Obs. | 32 | 32 | 32 |
| R2 | 0.799 | 0.799 | 0.799 |
| R2 Adj. | 0.751 | 0.751 | 0.751 |
| AIC | 169.4 | 169.4 | 169.4 |
| BIC | 181.1 | 181.1 | 181.1 |
| Log.Lik. | -76.677 | | |
| F | 16.601 | | |
| RMSE | 2.66 | 2.66 | 2.66 |
| Std.Errors | | IID | IID |
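The coefficients under one reference category can be recovered from those under another by simple addition. For example, the Model 3 intercept is the Model 1 intercept plus the factor(cyl)8 coefficient: 26.771696 - 12.975383 = 13.796313. As a rough sketch in base R only (the names fit, b, intercept_ref8, hp_slope_ref8 and cyl6_vs_8 are made up for illustration), the same arithmetic can be done on the fitted lm() object:

```r
# Start from the default fit, where cyl = 4 is the reference category.
fit <- lm(mpg ~ drat + factor(cyl) * hp, data = mtcars)
b <- coef(fit)

# Re-express the estimates with cyl = 8 as the baseline (illustrative names).
intercept_ref8 <- b[["(Intercept)"]] + b[["factor(cyl)8"]]   # intercept for cyl = 8
hp_slope_ref8  <- b[["hp"]] + b[["factor(cyl)8:hp"]]         # hp slope for cyl = 8
cyl6_vs_8      <- b[["factor(cyl)6"]] - b[["factor(cyl)8"]]  # cyl = 6 relative to cyl = 8

print(c(intercept = intercept_ref8, hp = hp_slope_ref8, cyl6 = cyl6_vs_8))
```

These three quantities should correspond to the (Intercept), hp and cyl = 6 rows of the Model 3 column, which is why the choice of ref is a matter of interpretation rather than fit.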