Certain iterations in linear regression analysis - loop? in R

Time:10-26

I have a linear regression fitted on a hundred values. Now I want to know what the coefficient is for specific sub-ranges of the data, for example for each consecutive block of 10 values.

Result should be similar to this:

Coefficient from 1-10: 0.5

Coefficient from 11-20: 0.33

Coefficient from 21-30: 0.306

......

I need a solution that generalizes, because my real data set is much larger.

Example:

set.seed(111)
a <- rnorm(100) 
b <- rnorm(100)
abc <- lm(a ~ b)
summary(abc)

Thanks in advance!

CodePudding user response:

You can use lmList:

DF <- data.frame(a, b)
DF$g <- rep(1:10, each = 10) #grouping variable

library(nlme)
fit <- lmList(a ~ b | g, data = DF, pool = FALSE)
coef(fit)
#   (Intercept)           b
#1  -0.67657906 -0.13354482
#2  -0.04171987 -0.14376230
#3   0.21816989 -0.14235641
#4  -0.86485164 -0.62314870
#5   0.26063798  0.10143534
#6   0.46665016 -0.08049576
#7   0.73507428 -0.54861970
#8   0.18782393  0.46275608
#9   0.02541912  0.57539731
#10  0.11944852  0.89788608

summary(fit)
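
If you would rather not load an extra package, the same per-block fits can be sketched in base R with split() and sapply(). This is only a sketch: block_size, slopes, from and to are names I made up for illustration, and it prints the slope in the "Coefficient from 1-10" layout asked for in the question.

```r
set.seed(111)
a <- rnorm(100)
b <- rnorm(100)

block_size <- 10  # assumed block length, change as needed
DF <- data.frame(a, b)

# ceiling() puts rows 1-10 in block 1, rows 11-20 in block 2, and so on;
# it also copes with a row count that is not a multiple of block_size
DF$g <- ceiling(seq_len(nrow(DF)) / block_size)

# fit one regression per block and keep only the slope on b
slopes <- sapply(split(DF, DF$g), function(d) coef(lm(a ~ b, data = d))[["b"]])

# first and last row number covered by each block
from <- (seq_along(slopes) - 1) * block_size + 1
to <- pmin(from + block_size - 1, nrow(DF))

cat(sprintf("Coefficient from %d-%d: %.3f\n", from, to, slopes), sep = "")
```

Note that a trailing block with fewer than two rows cannot identify a slope, so lm() would return NA for it.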

CodePudding user response:

Using dplyr, group by sub-range, then use group_map() to get coefficients for each sub-range:

library(dplyr)

tibble(a, b) %>%
  group_by(g = rep(1:10, each = 10)) %>%
  group_map(~ coef(lm(a ~ b, data = .x))) %>%
  bind_rows()

Output:

# A tibble: 10 × 2
   `(Intercept)`       b
           <dbl>   <dbl>
 1       -0.677  -0.134 
 2       -0.0417 -0.144 
 3        0.218  -0.142 
 4       -0.865  -0.623 
 5        0.261   0.101 
 6        0.467  -0.0805
 7        0.735  -0.549 
 8        0.188   0.463 
 9        0.0254  0.575 
10        0.119   0.898 
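
If you also want the row range next to each coefficient, something along these lines might work. It is a sketch under my own assumptions: block_size and the column names row, from, to and slope are not from the question, and the grouping is built with ceiling() so the block length is set in one place.

```r
library(dplyr)

set.seed(111)
a <- rnorm(100)
b <- rnorm(100)

block_size <- 10  # assumed block length, change as needed

res <- tibble(a, b) %>%
  mutate(row = row_number(),
         g = ceiling(row / block_size)) %>%  # block index for every row
  group_by(g) %>%
  summarise(from = min(row),                  # first row in the block
            to = max(row),                    # last row in the block
            slope = coef(lm(a ~ b))[["b"]])   # slope fitted on this block only

res
```

Inside summarise() the formula a ~ b is evaluated on the current group's columns, so each slope comes from that block's rows only.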