Home > Software design >  Is it possible to extract slope and intercept from multiple fitted lines into a tibble?
Is it possible to extract slope and intercept from multiple fitted lines into a tibble?

Time:06-17

I'm trying to compare the slope and intercept of many separate fitted lines and would like to extract this information from the equations that are shown using stat_poly_eq. I am able to plot all of the data and lines but since in my actual data I has over 50 equations, I'd like a simple way to extract the slope and intercept of each line.

Below is code to generate a similar plot with mtcars.

I'd like to add an output as a tibble with columns for cyl, gear, slope, intercept.

library(tidyverse)
library(ggpmisc)

ggplot(mtcars, aes(x = wt, y = mpg, color = as.character(cyl)))  
  geom_point() 
  facet_wrap(gear ~ .)  
  stat_poly_line(fullrange = TRUE, se = FALSE)  
  stat_poly_eq(aes(label = paste(..eq.label.., ..rr.label.., sep = "~~~~~")), 
               parse=TRUE,label.x.npc = "right")

CodePudding user response:

You can use nlme::lmList to help run all regression.

library(nlme)

#create a new grouping variable for the nested grouping factor
mtcars2 <- mtcars %>% mutate(gear_cyl = interaction(mtcars$gear, mtcars$cyl))

fm1 <- lmList(mpg ~ wt | gear_cyl, mtcars2)

Calling the summary for coefficient and r-squared will return you the same statistics shown on your ggplot.

summary(fm1)$coef

, , (Intercept)

    Estimate Std. Error  t value     Pr(>|t|)
3.4 21.50000        NaN      NaN          NaN
4.4 40.85910   4.254923 9.602782 4.813403e-08
5.4 41.01754        NaN      NaN          NaN
3.6 64.70408        NaN      NaN          NaN
4.6 30.20964  12.042632 2.508558 2.326993e-02
5.6 19.70000        NaN      NaN          NaN
3.8 25.05942   4.527584 5.534834 4.526681e-05
5.8 22.14000        NaN      NaN          NaN

, , wt

      Estimate Std. Error    t value    Pr(>|t|)
3.4         NA        NaN         NA          NA
4.4  -5.859279   1.741259 -3.3649675 0.003940916
5.4  -7.017544        NaN        NaN         NaN
3.6 -13.469388        NaN        NaN         NaN
4.6  -3.380894   3.866794 -0.8743402 0.394867930
5.6         NA        NaN         NA          NA
3.8  -2.438894   1.085886 -2.2459951 0.039178704
5.8  -2.000000        NaN        NaN         NaN
summary(fm1)$r.squared

[1] 0.0000000 0.5358963 1.0000000 1.0000000 0.8095674 0.0000000 0.4561613 1.0000000

To put everything into a tibble:

output <- tibble(gear_cyl = names(fm1),
                 intercept = summary(fm1)$coef[,1,1],
                 slope = summary(fm1)$coef[,1,2],
                 r_sq = summary(fm1)$r.squared)

output %>% separate(gear_cyl, c("gear", "cyl"), sep="\\.")

# A tibble: 8 x 5
  gear  cyl   intercept  slope  r_sq
  <chr> <chr>     <dbl>  <dbl> <dbl>
1 3     4          21.5  NA    0    
2 4     4          40.9  -5.86 0.536
3 5     4          41.0  -7.02 1    
4 3     6          64.7 -13.5  1    
5 4     6          30.2  -3.38 0.810
6 5     6          19.7  NA    0    
7 3     8          25.1  -2.44 0.456
8 5     8          22.1  -2.00 1   

  • Related