How can I print the results of a summary and predict function by running a single code chunk using d

Time:07-22

I am trying to fit several linear models using tidyverse in R. I want to print the results of the model fit using summary, as well as a custom function designed to return statistics that summary does not report (such as AIC values), and then apply the model to predict values for a set of known data (a test dataset). Here is an example of what I am doing using the mtcars dataset.

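library(tidyverse)
library(magrittr)  # provides the %$% exposition pipe

# Model summary
mtcars %>%
  filter(gear == 4) %$%
  lm(hp ~ mpg) %>%
  summary()

# AIC of the same model
mtcars %>%
  filter(gear == 4) %$%
  lm(hp ~ mpg) %>%
  AIC()

# Prediction from the same model for a new observation
mtcars %>%
  filter(gear == 4) %$%
  lm(hp ~ mpg) %>%
  predict(newdata = data.frame(mpg = 19))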

I am often doing a lot of filtering of my data before calling lm (due to missing data that are not missing for all models, using mutate calls, using summarise, or filtering based on a categorical variable of interest), and fitting many different model permutations. However, I end up having to call the same code multiple times in order to obtain the summary statistics.

Normally I would just save the lm models as objects, but here I only want to run a preliminary test to see whether a given version is worth saving, and I don't want large numbers of lm objects cluttering up my global environment. However, it seems that once a pipe is called after lm, the temporary lm object cannot be used again.

Is there any tidy way to retain a fitted lm object and fork it in the same string of code such that I can print the results of a summary, predict, and AIC function in a single call?

CodePudding user response:

A magrittr pipeline allows a braced code block in which . is the value coming from the chain. So

mtcars %>%
  filter(gear == 4) %$%
  lm(hp ~ mpg) %>% {list(
    summary(.),
    AIC(.),
    predict(., newdata = data.frame(mpg = 19))
  )}

Will work. You could also use the %T>% (tee) pipe, which passes its left-hand side on unchanged, but you will need to explicitly print the values inside the chain if you want to see them:

mtcars %>%
  filter(gear == 4) %$%
  lm(hp ~ mpg) %T>%
  {print(summary(.))} %T>%
  {print(AIC(.))} %>%
  predict(newdata = data.frame(mpg = 19))
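
If the main concern is keeping temporary lm objects out of the global environment, base R's local() is another route. This is a minimal sketch and not part of the answer above: it assumes subset() as a stand-in for filter(), and the names fit and out are only illustrative.

```r
# Everything inside local() is evaluated in a temporary environment,
# so `fit` does not persist in the global environment afterwards.
out <- local({
  fit <- lm(hp ~ mpg, data = subset(mtcars, gear == 4))
  list(
    summary    = summary(fit),
    AIC        = AIC(fit),
    prediction = predict(fit, newdata = data.frame(mpg = 19))
  )
})

# Print all three results at once.
print(out)
```

Dropping the `out <-` assignment and running the local() call at the top level prints the list without saving anything.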

CodePudding user response:

One option is to make a custom function that produces the desired outputs together. Then you can feed whatever data you like in as a single line.

library(tidyverse)

## function to produce all desired outputs in one object
f <- function(train_data = mtcars,
              x = "mpg",
              y = "hp",
              test_data = data.frame(mpg = 19)) {
  # build the model formula from the column names, e.g. hp ~ mpg
  formula <- as.formula(paste0(y, "~", x))
  mod <- lm(formula, data = train_data)
  list(
    summary = summary(mod),
    AIC = AIC(mod),
    prediction = predict(mod, test_data)
  )
}

f()
#> $summary
#> 
#> Call:
#> lm(formula = formula, data = train_data)
#> 
#> Residuals:
#>    Min     1Q Median     3Q    Max 
#> -59.26 -28.93 -13.45  25.65 143.36 
#> 
#> Coefficients:
#>             Estimate Std. Error t value Pr(>|t|)    
#> (Intercept)   324.08      27.43  11.813 8.25e-13 ***
#> mpg            -8.83       1.31  -6.742 1.79e-07 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 43.95 on 30 degrees of freedom
#> Multiple R-squared:  0.6024, Adjusted R-squared:  0.5892 
#> F-statistic: 45.46 on 1 and 30 DF,  p-value: 1.788e-07
#> 
#> 
#> $AIC
#> [1] 336.8553
#> 
#> $prediction
#>        1 
#> 156.3174

Created on 2022-07-21 by the reprex package (v2.0.1)
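
Because the training data is the first argument, the same idea extends to running one call per subset. The sketch below is an illustration only: fit_report() is a hypothetical variant of f() that takes a formula instead of column names, and split() on gear is assumed here in place of a filter() step.

```r
# Hypothetical variant of f() that takes a formula instead of column names.
fit_report <- function(train_data, formula, test_data) {
  mod <- lm(formula, data = train_data)
  list(
    summary    = summary(mod),
    AIC        = AIC(mod),
    prediction = predict(mod, newdata = test_data)
  )
}

# split() returns a named list of data frames, one per value of gear;
# lapply() passes each one to fit_report() as train_data.
by_gear <- lapply(
  split(mtcars, mtcars$gear),
  fit_report,
  formula   = hp ~ mpg,
  test_data = data.frame(mpg = 19)
)

# Compare AIC across the subsets; the names are the gear values.
sapply(by_gear, function(r) r$AIC)
```

Only the list of results is stored; the fitted lm objects exist just inside each function call.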
