I am trying to filter and aggregate results from multiple regression models, each fitted on a subset of a dataset using dlply.
This is how I ran my models:
library(plyr)
data("mtcars")
models = dlply(mtcars, .(cyl), function(df) lm(mpg ~ hp,data=df))
lapply(models, summary)
Right now I am combining the results from the different models (cylinders 4, 6, and 8) like this:
rbind(
c("Cylinder 4", coef(lapply(models, summary)$`4`)[2,]),
c("Cylinder 6", coef(lapply(models, summary)$`6`)[2,]),
c("Cylinder 8", coef(lapply(models, summary)$`8`)[2,])
)
Is there a way to summarize this more efficiently? Thanks in advance.
CodePudding user response:
We can use tidy from broom, rather than using summary and coef. We can also just pipe the model data straight into map2_df.
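library(plyr)       # dlply() and .(); load before tidyverse so dplyr's verbs take precedence
library(tidyverse)
library(broom)      # tidy() is not attached by library(tidyverse)
dlply(mtcars, .(cyl), function(df)
  lm(mpg ~ hp, data = df)) %>%
  map2_df(
    .,
    names(.),
    # keep only the hp row of each model and label it with its cylinder group
    ~ tidy(.x)[2, ] %>% mutate(Cylinder = .y)
  )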
Output:
  term  estimate std.error statistic p.value Cylinder
  <chr>    <dbl>     <dbl>     <dbl>   <dbl> <chr>
1 hp    -0.113      0.0612    -1.84   0.0984 4
2 hp    -0.00761    0.0266    -0.286  0.786  6
3 hp    -0.0142     0.0139    -1.02   0.326  8
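If adding broom is not an option, roughly the same table can be built in base R. The following is a minimal sketch rather than part of the approach above: it assumes split plus lapply as a stand-in for dlply so that it runs without plyr, and the object name slopes is only illustrative.

```r
data("mtcars")

# Fit one model per cylinder group (base R stand-in for dlply)
models <- lapply(split(mtcars, mtcars$cyl), function(df) lm(mpg ~ hp, data = df))

# Take the hp row of each coefficient table; sapply returns one column per
# model, so transpose to get one row per cylinder group
slopes <- t(sapply(models, function(m) coef(summary(m))["hp", ]))

# Move the group labels from the row names into a regular column
slopes <- data.frame(Cylinder = rownames(slopes), slopes,
                     row.names = NULL, check.names = FALSE)
slopes
```

The columns keep the names used by summary (Estimate, Std. Error, t value, Pr(>|t|)) rather than the names tidy uses, so downstream code has to refer to them with backticks or by position.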