I'm trying to save the summary of a model as a data frame in R. The model is a stepwise regression model fitted with the MASS package. I'm primarily interested in saving the coefficients, their t values, and the R-squared of the model.
I tried
ModelSummary <- data.frame(unclass(summary(step.model)), check.names = FALSE, stringsAsFactors = FALSE)
But had the error
Error in as.data.frame.default(x[[i]], optional = TRUE, stringsAsFactors = stringsAsFactors) :
cannot coerce class ‘"lm"’ to a data.frame
CodePudding user response:
Once you have fitted the model, call summary() on it. You can then access the coefficient table and the R-squared of the summary object using the $ syntax: $coefficients and $r.squared. Then you can use cbind() to combine these.
So, for sample data:
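# Regress mpg on all other columns (written out explicitly instead of lm(mtcars))
model.cars <- lm(mpg ~ ., data = mtcars)
# r.squared and the t values live on the summary object, not on the lm fit
summary.cars <- summary(model.cars)
# cbind() returns a matrix, so convert it to get an actual data frame
want <- as.data.frame(cbind(summary.cars$coefficients, rsq = summary.cars$r.squared))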
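Since the question is about a stepwise model, here is a minimal sketch of the same idea applied to one. It assumes the model was built with stepAIC() from MASS and uses mtcars as a stand-in for the real data; the names full.model and the column names of ModelSummary are only illustrative. The original attempt most likely failed because the summary is a list of components with different shapes, which data.frame() cannot combine directly, so the sketch picks out only the pieces of interest.

```r
library(MASS)  # provides stepAIC()

# Assumption: mtcars stands in for the asker's data
full.model <- lm(mpg ~ ., data = mtcars)
step.model <- stepAIC(full.model, direction = "both", trace = FALSE)

step.summary <- summary(step.model)
coef.mat <- step.summary$coefficients  # matrix: Estimate, Std. Error, t value, Pr(>|t|)

# Keep only the requested pieces; the single R-squared value is repeated on every row
ModelSummary <- data.frame(
  term      = rownames(coef.mat),
  estimate  = unname(coef.mat[, "Estimate"]),
  t.value   = unname(coef.mat[, "t value"]),
  r.squared = step.summary$r.squared,
  stringsAsFactors = FALSE
)
ModelSummary
```

The result has one row per coefficient kept by the stepwise selection, so it can be written out with write.csv() or combined with other models' summaries.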
CodePudding user response:
Quick Intro
The broom package has a lot of great ways to turn your regression summaries into data frames. Since you did not include a reproducible dataset, I will just use a simple regression on the iris dataset as an example. First, you can load the packages broom for the tidy data frames and tidyverse for wrangling/plotting the data.
#### Load Libraries ####
library(tidyverse)
library(broom)
Then fit a regression like so:
#### Fit Regression Model ####
fit <- lm(Petal.Length ~ Petal.Width,
iris)
Broom Functions
The tidy function turns your coefficient table into a data frame instantly.
#### Tidy Dataframe ####
fit.tidy <- tidy(fit)
fit.tidy
Like so:
# A tibble: 2 × 5
term estimate std.error statistic p.value
<chr> <dbl> <dbl> <dbl> <dbl>
1 (Intercept) 1.08 0.0730 14.8 4.04e-31
2 Petal.Width 2.23 0.0514 43.4 4.68e-86
The augment function returns the data used to fit the model together with per-observation quantities such as fitted values, residuals, and influence measures, all in one data frame.
#### Augmented Dataframe ####
fit.aug <- augment(fit)
fit.aug
Like so:
# A tibble: 150 × 8
Petal.Length Petal.Width .fitted .resid .hat .sigma .cooksd .std.resid
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 1.4 0.2 1.53 -0.130 0.0182 0.480 0.000693 -0.273
2 1.4 0.2 1.53 -0.130 0.0182 0.480 0.000693 -0.273
3 1.3 0.2 1.53 -0.230 0.0182 0.479 0.00218 -0.484
4 1.5 0.2 1.53 -0.0295 0.0182 0.480 0.0000360 -0.0624
5 1.4 0.2 1.53 -0.130 0.0182 0.480 0.000693 -0.273
6 1.7 0.4 1.98 -0.276 0.0140 0.479 0.00240 -0.580
7 1.4 0.3 1.75 -0.353 0.0160 0.479 0.00449 -0.743
8 1.5 0.2 1.53 -0.0295 0.0182 0.480 0.0000360 -0.0624
9 1.4 0.2 1.53 -0.130 0.0182 0.480 0.000693 -0.273
10 1.5 0.1 1.31 0.193 0.0206 0.480 0.00176 0.409
To look at the overall fit metrics of your final model, use glance, which collects things like R squared, adjusted R squared, etc. and places them into a one-row data frame.
#### Glance Dataframe ####
fit.glance <- glance(fit)
fit.glance
Like so:
# A tibble: 1 × 12
r.squared adj.r.squ…¹ sigma stati…² p.value df logLik AIC BIC devia…³
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 0.927 0.927 0.478 1882. 4.68e-86 1 -101. 208. 217. 33.8
These data frame variables can also be very quickly selected using pull from the tidyverse (specifically dplyr).
fit.glance %>%
pull(adj.r.squared)
Giving you a quick adjusted R square value:
[1] 0.9266173
Usage Example
I will show you how this can be used for one case with the augment code I just used. Here is a scatterplot of the residuals against the fitted values, using the model data frame:
fit.aug %>%
ggplot(aes(x=.fitted,
y=.resid)) +
geom_point()
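Since you are interested in combining some of these values in a data frame, you can also do this by adding columns from one data frame to another. Here the R squared values from glance are added to the tidy coefficient table, and each single value is repeated on every row.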
fit.tidy %>%
add_column(r.square = fit.glance$r.squared,
adj.r.square = fit.glance$adj.r.squared)
Like so:
# A tibble: 2 × 7
term estimate std.error statistic p.value r.square adj.r.square
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 (Intercept) 1.08 0.0730 14.8 4.04e-31 0.927 0.927
2 Petal.Width 2.23 0.0514 43.4 4.68e-86 0.927 0.927
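To tie this back to the question, here is a rough sketch of the same broom approach on a stepwise model. It assumes the model comes from stepAIC() in MASS and again uses mtcars as placeholder data; full.model and model.summary are made-up names. The columns are picked with base R so that loading MASS and dplyr together (both define select) does not cause a naming conflict.

```r
library(MASS)   # provides stepAIC()
library(broom)  # provides tidy() and glance()

# Assumption: mtcars stands in for the asker's data
full.model <- lm(mpg ~ ., data = mtcars)
step.model <- stepAIC(full.model, direction = "both", trace = FALSE)

step.tidy   <- tidy(step.model)    # one row per coefficient
step.glance <- glance(step.model)  # one row of model-level metrics

# Coefficients, t values (the "statistic" column), and R squared in one data frame
model.summary <- data.frame(
  term      = step.tidy$term,
  estimate  = step.tidy$estimate,
  t.value   = step.tidy$statistic,
  r.squared = step.glance$r.squared
)
model.summary
```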