I'm trying to compare the slope and intercept of many separate fitted lines and would like to extract this information from the equations that are shown using stat_poly_eq. I am able to plot all of the data and lines but since in my actual data I has over 50 equations, I'd like a simple way to extract the slope and intercept of each line.
Below is code to generate a similar plot with mtcars.
I'd like to add an output as a tibble with columns for cyl, gear, slope, intercept.
library(tidyverse)
library(ggpmisc)
ggplot(mtcars, aes(x = wt, y = mpg, color = as.character(cyl)))
geom_point()
facet_wrap(gear ~ .)
stat_poly_line(fullrange = TRUE, se = FALSE)
stat_poly_eq(aes(label = paste(..eq.label.., ..rr.label.., sep = "~~~~~")),
parse=TRUE,label.x.npc = "right")
CodePudding user response:
You can use nlme::lmList
to help run all regression.
library(nlme)
#create a new grouping variable for the nested grouping factor
mtcars2 <- mtcars %>% mutate(gear_cyl = interaction(mtcars$gear, mtcars$cyl))
fm1 <- lmList(mpg ~ wt | gear_cyl, mtcars2)
Calling the summary for coefficient and r-squared will return you the same statistics shown on your ggplot.
summary(fm1)$coef
, , (Intercept)
Estimate Std. Error t value Pr(>|t|)
3.4 21.50000 NaN NaN NaN
4.4 40.85910 4.254923 9.602782 4.813403e-08
5.4 41.01754 NaN NaN NaN
3.6 64.70408 NaN NaN NaN
4.6 30.20964 12.042632 2.508558 2.326993e-02
5.6 19.70000 NaN NaN NaN
3.8 25.05942 4.527584 5.534834 4.526681e-05
5.8 22.14000 NaN NaN NaN
, , wt
Estimate Std. Error t value Pr(>|t|)
3.4 NA NaN NA NA
4.4 -5.859279 1.741259 -3.3649675 0.003940916
5.4 -7.017544 NaN NaN NaN
3.6 -13.469388 NaN NaN NaN
4.6 -3.380894 3.866794 -0.8743402 0.394867930
5.6 NA NaN NA NA
3.8 -2.438894 1.085886 -2.2459951 0.039178704
5.8 -2.000000 NaN NaN NaN
summary(fm1)$r.squared
[1] 0.0000000 0.5358963 1.0000000 1.0000000 0.8095674 0.0000000 0.4561613 1.0000000
To put everything into a tibble:
output <- tibble(gear_cyl = names(fm1),
intercept = summary(fm1)$coef[,1,1],
slope = summary(fm1)$coef[,1,2],
r_sq = summary(fm1)$r.squared)
output %>% separate(gear_cyl, c("gear", "cyl"), sep="\\.")
# A tibble: 8 x 5
gear cyl intercept slope r_sq
<chr> <chr> <dbl> <dbl> <dbl>
1 3 4 21.5 NA 0
2 4 4 40.9 -5.86 0.536
3 5 4 41.0 -7.02 1
4 3 6 64.7 -13.5 1
5 4 6 30.2 -3.38 0.810
6 5 6 19.7 NA 0
7 3 8 25.1 -2.44 0.456
8 5 8 22.1 -2.00 1