I have multiple regression lines. I want to combine the coefficients into one dataframe for easy visualization.
However, not all regressions have the same coefficients, so I was not able to use a for loop looking for the coefficient name.
Here is an example with some sample data and the desired output.
df=structure(list(x1 = c(-0.689814979498939, -0.509885025360363,
-0.20248689168896, -1.79535329549682, 1.60447678701814, -0.408696703105769,
0.97243696942363, -0.688339413750959, -0.359380427396309, 1.11638856659614
), x2 = c(0.775426469430265, 0.367906637531888, 0.965721516497862,
-0.601113535090469, -0.655567870650469, 1.45494263752806, 0.187276141272287,
-0.659949502938592, -0.481763339717836, -0.581132345668067),
x3 = c(-0.17202393327554, 0.022376822081548, -1.05069599269781,
-0.631926480864125, 1.76178640615702, -1.60488439781703,
0.172936842119056, 0.750091896988, -1.60900096983098, 0.443223570706679
), x4 = c(-0.117822668731567, -0.645150368596604, -1.58642572549226,
0.3630617077837, -1.00866095836508, 0.696818785571135, 0.978471598076335,
-0.315392158997475, 1.37594860146428, 0.0574562910914235),
y = c(-1.07067139899979, -0.360297366336307, 0.0328023505398295,
1.07908579247402, 0.185603676169661, 0.384858869675533, 0.62179479088495,
1.44265090318836, 0.340526158232088, -1.20387054108186)), class = "data.frame", row.names = c(NA,
-10L))
model1=lm(y~x1, data=df)
model2=lm(y~x2, data=df)
model3=lm(y~x2+x4, data=df)
model4=lm(y~x2+x3+x4, data=df)
coefs_x1=c(-0.2749230,NA,NA,NA)
coefs_x2=c(NA,-0.2795309,-0.2599686,-0.40977455)
coefs_x3=c(NA,NA,NA,-0.18740855)
coefs_x4=c(NA,NA,0.1568399,0.04981574)
output_df=data.frame(coefs_x1,coefs_x2,coefs_x3,coefs_x4)
> output_df
   coefs_x1   coefs_x2   coefs_x3   coefs_x4
1 -0.274923         NA         NA         NA
2        NA -0.2795309         NA         NA
3        NA -0.2599686         NA 0.15683990
4        NA -0.4097746 -0.1874086 0.04981574
CodePudding user response:
You could do:
library(tidyverse)
forms <- list(x1 = y ~ x1, x2 = y ~ x2, x3 = y ~ x2 + x4, x4 = y ~ x2 + x3 + x4)
map(forms, ~t(coef(lm(.x, data = df)))) %>%
plyr::rbind.fill.matrix() %>%
as.data.frame() %>%
select(-1)
         x1         x2         x4         x3
1 -0.274923         NA         NA         NA
2        NA -0.2795309         NA         NA
3        NA -0.2599686 0.15683990         NA
4        NA -0.4097745 0.04981574 -0.1874086
CodePudding user response:
There are many ways to do that; here is what I would typically do using dplyr.
You can access each model's coefficients directly, since they are stored inside the model objects. Calling model1$coefficients returns the coefficients, including the intercept. Since you don't want the intercept (at least you didn't mention it in your question), I'm removing it with the base R index [-1], which drops the first element.
Then I'm putting all the rows together with bind_rows() and ordering the columns with select(). bind_rows() stacks the vectors as rows, adds any columns it has not seen yet, and fills the missing entries with NA, which solves your problem.
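To make the extraction step concrete, here is a minimal sketch of what the coefficient vector looks like before and after dropping the intercept. The data frame d and the model fit are simulated stand-ins (assumptions for illustration, not the data from the question), and coef(fit) is used as the accessor equivalent of fit$coefficients.

```r
# Simulated stand-in data (hypothetical; not the data from the question)
set.seed(1)
d <- data.frame(x2 = rnorm(10), x4 = rnorm(10))
d$y <- 1 + 2 * d$x2 - d$x4 + rnorm(10)

fit <- lm(y ~ x2 + x4, data = d)

all_coefs <- coef(fit)      # named numeric vector: (Intercept), x2, x4
slopes    <- all_coefs[-1]  # a negative index drops the first element

names(all_coefs)
names(slopes)
```

Because the result is a named vector, the names are what later let the rows line up by coefficient instead of by position.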
Solution
library(dplyr)
bind_rows(model1$coefficients[-1],
model2$coefficients[-1],
model3$coefficients[-1],
model4$coefficients[-1]) %>%
select(x1, x2, x3, x4)
Output
# A tibble: 4 x 4
      x1     x2     x3      x4
   <dbl>  <dbl>  <dbl>   <dbl>
1 -0.275 NA     NA     NA
2 NA     -0.280 NA     NA
3 NA     -0.260 NA      0.157
4 NA     -0.410 -0.187  0.0498
FYI, the output is the same as yours. Tibbles round the values for display, but the full precision is kept in the underlying data.
CodePudding user response:
Using base R (the list is created simply to make the names easier to build; the original idea was to rbind with do.call):
# assumes the coefficient vectors are named coefs_x1, coefs_x2, ...
coefs <- mget(ls(pattern = "^coefs_x"))  # collect the vectors into a named list
as.data.frame(coefs, col.names = paste0("coefs_x", seq_along(coefs)))
   coefs_x1   coefs_x2   coefs_x3   coefs_x4
1 -0.274923         NA         NA         NA
2        NA -0.2795309         NA         NA
3        NA -0.2599686         NA 0.15683990
4        NA -0.4097746 -0.1874086 0.04981574
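The rbind with do.call idea can also be applied to the fitted models themselves, so the coefficient vectors do not have to be typed by hand. The following is only a sketch under stated assumptions: the data frame d is simulated stand-in data, and the names models, all_terms, rows and coef_df are made up for illustration. It relies on the fact that indexing a named vector by a name it does not contain returns NA.

```r
# Simulated stand-in data (hypothetical; not the data from the question)
set.seed(42)
d <- data.frame(x1 = rnorm(20), x2 = rnorm(20), x3 = rnorm(20), x4 = rnorm(20))
d$y <- rnorm(20)

models <- list(
  lm(y ~ x1, data = d),
  lm(y ~ x2, data = d),
  lm(y ~ x2 + x4, data = d),
  lm(y ~ x2 + x3 + x4, data = d)
)

# Every term that appears in any model, minus the intercept, in sorted order
all_terms <- sort(setdiff(
  unique(unlist(lapply(models, function(m) names(coef(m))))),
  "(Intercept)"
))

# Indexing by a name the model lacks yields NA, which pads the gaps
rows <- lapply(models, function(m) setNames(coef(m)[all_terms], all_terms))

# One row per model, one column per coefficient
coef_df <- as.data.frame(do.call(rbind, rows))
coef_df
```

Using a list of models rather than model1, model2, ... keeps the code the same length however many regressions there are.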