I have a linear regression fitted to a hundred values. Now I want to know what the coefficient is within certain sub-ranges of the data, for example for each consecutive block of 10 values.
Result should be similar to this:
Coefficient from 1-10: 0.5
Coefficient from 11-20: 0.33
Coefficient from 21-30: 0.306
......
I need a solution that generalizes, because my real data set is much larger.
Example:
set.seed(111)
a <- rnorm(100)
b <- rnorm(100)
abc <- lm(a ~ b)
summary(abc)
Thanks in advance!
CodePudding user response:
You can use lmList:
DF <- data.frame(a, b)
DF$g <- rep(1:10, each = 10) #grouping variable
library(nlme)
fit <- lmList(a ~ b | g, data = DF, pool = FALSE)
coef(fit)
#   (Intercept)           b
#1  -0.67657906 -0.13354482
#2  -0.04171987 -0.14376230
#3   0.21816989 -0.14235641
#4  -0.86485164 -0.62314870
#5   0.26063798  0.10143534
#6   0.46665016 -0.08049576
#7   0.73507428 -0.54861970
#8   0.18782393  0.46275608
#9   0.02541912  0.57539731
#10  0.11944852  0.89788608
summary(fit)
CodePudding user response:
Using dplyr, group by sub-range, then use group_map() to get the coefficients for each sub-range:
library(dplyr)
tibble(a, b) %>%
  group_by(g = rep(1:10, each = 10)) %>%
  group_map(~ coef(lm(a ~ b, data = .x))) %>%
  bind_rows()
Output:
# A tibble: 10 × 2
`(Intercept)` b
<dbl> <dbl>
1 -0.677 -0.134
2 -0.0417 -0.144
3 0.218 -0.142
4 -0.865 -0.623
5 0.261 0.101
6 0.467 -0.0805
7 0.735 -0.549
8 0.188 0.463
9 0.0254 0.575
10 0.119 0.898
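Both answers hard-code the grouping as rep(1:10, each = 10), which only fits exactly 100 rows. For a larger data set, one option is to derive the group from the row index instead. The following is a minimal base R sketch, not taken from either answer: block_size, idx, g, slopes and labels are names chosen here for illustration, and it assumes the blocks are consecutive rows of equal width (the last block may be shorter, and needs at least two rows for a slope to be estimated).

```r
set.seed(111)
a <- rnorm(100)
b <- rnorm(100)

block_size <- 10                   # assumed block width; change as needed
idx <- seq_along(a)                # row index 1..n
g <- ceiling(idx / block_size)     # 1,1,...,2,2,...; works for any n

# Fit one regression per block and keep only the slope of b
slopes <- sapply(
  split(data.frame(a, b), g),
  function(d) coef(lm(a ~ b, data = d))[["b"]]
)

# Label each block with its row range, e.g. "1-10"
labels <- tapply(idx, g, function(i) paste(min(i), max(i), sep = "-"))

# Print in the format asked for in the question
cat(sprintf("Coefficient from %s: %.3f\n", labels, slopes), sep = "")
```

The same g can be passed as the grouping variable to lmList or group_by in place of rep(1:10, each = 10).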