How can we extract covariance matrix from a parsnip object?

Time:11-14

I am trying to use the tidymodels ecosystem to perform econometric analysis. The example I am following at the moment is from the book “Principles of Econometrics with R” by Colonescu. The data from the book can be installed with

devtools::install_github("ccolonescu/PoEdata")

0.1 The Example

I am fitting a wage discrimination model that includes an interaction effect. The model is as follows:

library(tidymodels)
library(PoEdata)#to load the data
library(car)#For linearHypothesis function

Loading required package: carData

lm_model <- linear_reg() %>% 
  set_engine("lm") # model specification
data("cps4_small")
mod1 <- lm_model %>% 
  fit(wage ~ educ + black * female, data = cps4_small) # model fitting

0.2 The Issue

After fitting the model, I want to test the hypothesis that there is no discrimination on the basis of gender or race. In other words, I need to test the joint hypothesis that the coefficients of black, female, and black:female are all zero at the same time. I want to use the linearHypothesis function from the car package for this.

hyp <- c("black=0", "female=0", "black:female=0")
tab <- tidy(linearHypothesis(mod1, hyp))

This gives me an error that there is no applicable method for vcov for an object of class _lm or model_fit.

So, how can I get the covariance matrix from a parsnip object?

Answer:

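You need to use extract_fit_engine() to pull the underlying lm fit object out of the parsnip model object. Functions such as linearHypothesis() call vcov() internally, and vcov() has a method for lm objects but not for the parsnip wrapper.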

library(tidymodels)
library(PoEdata)
library(car)

data("cps4_small")

lm_model <- linear_reg() %>% 
  set_engine("lm")

mod1 <- lm_model %>% 
  fit(wage ~ educ + black * female, data = cps4_small)

hyp <- c("black=0", "female=0", "black:female=0")

mod1 %>%
  extract_fit_engine() %>%
  linearHypothesis(hyp) %>%
  tidy()
#> # A tibble: 2 × 6
#>   res.df     rss    df sumsq statistic  p.value
#>    <dbl>   <dbl> <dbl> <dbl>     <dbl>    <dbl>
#> 1    998 135771.    NA   NA       NA   NA      
#> 2    995 130195.     3 5576.      14.2  4.53e-9

Created on 2021-11-13 by the reprex package (v2.0.1)
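
If you want the covariance matrix itself rather than the test, the same extraction step should work with vcov() directly. The following is a minimal sketch, not part of the original reprex: it assumes a parsnip version that provides extract_fit_engine(), and it uses the built-in mtcars data as a stand-in so that it runs without PoEdata. The names fit_obj, lm_fit, vc, and se are illustrative only.

```r
library(tidymodels)

# Stand-in data (assumption): mtcars ships with R. The formula mirrors the
# shape of wage ~ educ + black * female: one covariate plus two interacting
# 0/1 indicators.
fit_obj <- linear_reg() %>%
  set_engine("lm") %>%
  fit(mpg ~ wt + am * vs, data = mtcars)

# Pull out the underlying lm object; vcov() has a method for class "lm".
lm_fit <- extract_fit_engine(fit_obj)
vc <- vcov(lm_fit)

# Standard errors are the square roots of the diagonal of the covariance matrix.
se <- sqrt(diag(vc))

print(vc)
print(se)
```

With the model from the question, the equivalent call would be vcov(extract_fit_engine(mod1)).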
