I am trying to use the tidymodels ecosystem to perform econometric analysis. The example I am following at the moment is from the book “Principles of Econometrics with R” by Colonescu. The data from the book can be installed with
devtools::install_github("ccolonescu/PoEdata")
0.1 The Example
I am fitting a wage discrimination model that includes an interaction effect. The model is as follows:
library(tidymodels)
library(PoEdata) # to load the data
library(car) # for the linearHypothesis() function
#> Loading required package: carData
lm_model <- linear_reg() %>%
  set_engine("lm") # model specification
data("cps4_small")
mod1 <- lm_model %>%
  fit(wage ~ educ + black * female, data = cps4_small) # model fitting
0.2 The Issue
After fitting the model, I want to test the hypothesis that there is no discrimination on the basis of gender or race. In other words, I need to test the joint hypothesis that the coefficients of black, female, and black:female are all zero at the same time. I want to use the linearHypothesis function from the car package for this.
hyp <- c("black=0", "female=0", "black:female=0")
tab <- tidy(linearHypothesis(mod1, hyp))
This gives me an error that there is no applicable method for vcov for an object of class _lm or model_fit.
So, how can I obtain the covariance matrix from a parsnip object?
Answer:
You need to use extract_fit_engine() to get the underlying lm fit object out of the parsnip model object.
library(tidymodels)
library(PoEdata)
library(car)
data("cps4_small")
lm_model <- linear_reg() %>%
  set_engine("lm")
mod1 <- lm_model %>%
  fit(wage ~ educ + black * female, data = cps4_small)
hyp <- c("black=0", "female=0", "black:female=0")
mod1 %>%
  extract_fit_engine() %>%
  linearHypothesis(hyp) %>%
  tidy()
#> # A tibble: 2 × 6
#>   res.df     rss    df sumsq statistic  p.value
#>    <dbl>   <dbl> <dbl> <dbl>     <dbl>    <dbl>
#> 1    998 135771.    NA   NA       NA   NA
#> 2    995 130195.     3 5576.      14.2  4.53e-9
Created on 2021-11-13 by the reprex package (v2.0.1)
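If you want the covariance matrix itself, the same extraction step should work: once you have the underlying lm object, vcov() has a method for it. Below is a minimal sketch, not taken from the book. It assumes the built-in mtcars data as a stand-in for cps4_small, with the 0/1 columns am and vs playing the role of black and female. The names spec, pfit, lm_fit, f_wald, and f_anova are only illustrative. It also recomputes the joint F statistic by hand from the coefficients and the covariance matrix, then compares it with a restricted-versus-unrestricted anova() comparison, which is the calculation that the two rows of the linearHypothesis() table summarize.

```r
library(parsnip) # linear_reg(), set_engine(), fit(), extract_fit_engine()

# Assumption for illustration: mtcars stands in for cps4_small, and the
# 0/1 indicators am and vs stand in for black and female.
spec <- set_engine(linear_reg(), "lm")
pfit <- fit(spec, mpg ~ wt + am * vs, data = mtcars)

# Pull out the underlying lm object; vcov() and coef() work on it directly.
lm_fit <- extract_fit_engine(pfit)
V <- vcov(lm_fit)
b <- coef(lm_fit)

# Joint test that am, vs, and am:vs are all zero, computed by hand:
# F = b_r' V_rr^{-1} b_r / q, where r indexes the tested coefficients
# and q is the number of restrictions.
r <- c("am", "vs", "am:vs")
f_wald <- drop(t(b[r]) %*% solve(V[r, r]) %*% b[r]) / length(r)

# The same statistic from comparing the restricted and unrestricted fits.
restricted <- lm(mpg ~ wt, data = mtcars)
f_anova <- anova(restricted, lm_fit)$F[2]

# The two values should agree up to floating-point error.
all.equal(f_wald, f_anova)
```

With the default (homoskedastic) lm covariance matrix the two routes are algebraically the same test, so this is mainly a check that the extracted object behaves like an ordinary lm fit.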