db = tibble(a = rnorm(100), b = rnorm(100), c = rnorm(100))
If I want a tidy multivariate linear regression, I can just run
lm(data = db, a ~ 0 + b + c) %>% tidy()
But if I want multiple univariate regressions, I would run
lm(data = db, a ~ 0 + b) %>% tidy() %>%
  add_row(lm(data = db, a ~ 0 + c) %>% tidy())
Now, given many regressor columns, I would like to avoid coding every single regressor as a new add_row call. How can I make the code more concise?
This has a partial solution here:
Tidy output from many single-variable models using purrr, broom
I think the code can be even leaner than in that example?
CodePudding user response:
We could use {} to group the multiple expressions into a single block
library(magrittr)
library(broom)
lm(data = db, a ~ 0 + b) %>%
  tidy() %>%
  {add_row(., lm(data = db, a ~ 0 + c) %>%
     tidy())}
-output
# A tibble: 2 × 5
  term  estimate std.error statistic p.value
  <chr>    <dbl>     <dbl>     <dbl>   <dbl>
1 b       0.0601    0.0907     0.663   0.509
2 c       0.0411    0.0899     0.457   0.649
Or we may do this within summarise and unnest
library(dplyr)
library(tidyr)
db %>%
  summarise(out1 = list(bind_rows(lm(a ~ 0 + b) %>% tidy,
                                  lm(a ~ 0 + c) %>% tidy))) %>%
  unnest(out1)
-output
# A tibble: 2 × 5
  term  estimate std.error statistic p.value
  <chr>    <dbl>     <dbl>     <dbl>   <dbl>
1 b       0.0601    0.0907     0.663   0.509
2 c       0.0411    0.0899     0.457   0.649
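The formulas above use `0 +` to drop the intercept, which is why each single-regressor fit returns exactly one row. A minimal sketch of the difference, assuming a small simulated data frame (the seed and the names `no_int` and `with_int` are only for illustration and are not from the question):

```r
library(broom)

set.seed(1)  # arbitrary seed, only so this sketch is reproducible
db <- data.frame(a = rnorm(100), b = rnorm(100))

# `0 +` removes the intercept: tidy() returns a single row for b
no_int <- tidy(lm(a ~ 0 + b, data = db))

# Without `0 +`, lm() also fits an intercept: two rows
with_int <- tidy(lm(a ~ b, data = db))

no_int$term
with_int$term

# Through the origin, the least-squares slope is sum(a * b) / sum(b^2)
all.equal(no_int$estimate, sum(db$a * db$b) / sum(db$b^2))
```

If the intercept is wanted, dropping the `0 +` and filtering out the `(Intercept)` row afterwards is one option.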
CodePudding user response:
My answer
library(dplyr)
library(purrr)
library(broom)
db %>%
  select(-a) %>%
  names() %>%
  paste('a ~ 0 +', .) %>%
  map_df(~ tidy(lm(as.formula(.x), data = db)))
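A possible variant of the same idea that avoids pasting formula strings: base R's `reformulate()` builds the formula from the column name, and `intercept = FALSE` plays the role of `0 +`. This is only a sketch on simulated data; the names `regressors`, `fits` and `result` are made up here.

```r
library(broom)
library(dplyr)

set.seed(1)  # arbitrary seed, only so this sketch is reproducible
db <- tibble(a = rnorm(100), b = rnorm(100), c = rnorm(100))

# Every column except the response is treated as a regressor
regressors <- setdiff(names(db), "a")

# One no-intercept model per regressor, each tidied into a one-row tibble
fits <- lapply(regressors, function(x) {
  f <- reformulate(x, response = "a", intercept = FALSE)  # a ~ x - 1
  tidy(lm(f, data = db))
})

# Stack the per-model rows into a single tibble
result <- bind_rows(fits)
result
```

The result should have one row per regressor, in the same shape as the add_row version in the question.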
CodePudding user response:
You could do something like this, depending on your columns:
library(broom)
library(magrittr)  # for %>%
vars <- names(db)[-1]  # every column except the response a
models <- list()
for (i in 1:2){
  vc <- combn(vars, i)  # all combinations of i regressors
  for (j in 1:ncol(vc)){
    model <- as.formula(paste0("a ~ ", paste0(vc[,j], collapse = " + ")))
    models <- c(models, model)
  }
}
lapply(models, function(x) lm(x, data = db) %>% tidy())
[[1]]
# A tibble: 2 x 5
  term        estimate std.error statistic p.value
  <chr>          <dbl>     <dbl>     <dbl>   <dbl>
1 (Intercept)   0.0155    0.0856     0.181   0.857
2 b            -0.0502    0.0797    -0.630   0.530
[[2]]
# A tibble: 2 x 5
  term        estimate std.error statistic p.value
  <chr>          <dbl>     <dbl>     <dbl>   <dbl>
1 (Intercept)   0.0113    0.0856     0.132   0.896
2 c             0.0553    0.0865     0.640   0.524
[[3]]
# A tibble: 3 x 5
  term        estimate std.error statistic p.value
  <chr>          <dbl>     <dbl>     <dbl>   <dbl>
1 (Intercept)   0.0132    0.0860     0.153   0.878
2 b            -0.0439    0.0807    -0.544   0.588
3 c             0.0486    0.0877     0.555   0.580
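If a single tibble is preferred over a list, one option is to name each formula and let `bind_rows()` record that name in an id column. A rough sketch on simulated data, assuming dplyr is available; the names `rhs`, `formulas`, `all_models` and the `model` column are illustrative only:

```r
library(broom)
library(dplyr)

set.seed(1)  # arbitrary seed, only so this sketch is reproducible
db <- tibble(a = rnorm(100), b = rnorm(100), c = rnorm(100))

vars <- setdiff(names(db), "a")

# Right-hand sides for every non-empty subset of regressors: "b", "c", "b + c"
rhs <- unlist(lapply(seq_along(vars), function(i) {
  combn(vars, i, FUN = paste, collapse = " + ")
}))

# Name each formula string by itself so the name can serve as a model label
formulas <- paste("a ~", rhs)
names(formulas) <- formulas

# Fit each model, tidy it, and stack the results with a "model" id column
all_models <- bind_rows(
  lapply(formulas, function(f) tidy(lm(as.formula(f), data = db))),
  .id = "model"
)
all_models
```

Using `seq_along(vars)` instead of a fixed `1:2` lets the same code cover any number of regressor columns, although the number of models grows quickly with the number of columns.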