I've tried fitting a linear equation to data with a logarithmic y axis a couple different ways. I seem to get the correct line and equation when I use stat_poly_line and stat_poly_eq. (See example with mtcars)
library(tidyverse)
library(ggpmisc)
mtcars <- mtcars %>%
mutate(mpg10 = 10^mpg) #add column to have something work on logarithmic scale
ggplot(mtcars, aes(x = wt, y = mpg10))
geom_point()
scale_y_continuous(trans='log10') #make y axis logarithmic
stat_poly_line(fullrange = TRUE, se = FALSE)
stat_poly_eq(aes(label = paste(..eq.label.., ..rr.label.., sep = "~~~~~")),
parse=TRUE,label.x.npc = "right")
But when I use lm() to fit the line I get a different slope and intercept. Even though the line drawn on the plot looks identical to that drawn with stat_poly_line. Why is the equation for the line here incorrect? Am I calling the wrong values? or using lm() incorrectly?
model <- lm(log(mtcars$mpg10) ~ mtcars$wt)
ggplot(mtcars, aes(x = wt, y = mpg10))
geom_point()
scale_y_continuous(trans='log10')
stat_smooth(method = "lm", fullrange = TRUE, se = FALSE)
labs(title = paste("R^2 = ",signif(summary(model)$adj.r.squared, 2),
" ",
"y=",signif(model$coefficients[[1]],3 ), signif(model$coefficients[[2]], 3), "x"))
Ultimately, I'd like to be able to use nlme::lmList to record the slope and intercept of many separate lines but I can only do this if I am implementing lm correctly. It currently only seems to work for me if I don't have the logarithmic y axis.
CodePudding user response:
The regression equations are different because in the first version, stat_poly_eq
is using log to the base 10. In the second version you used log(mtcars$mpg10)
in your model, which is the natural log. You need to use log10
to bring it in line with the other version:
model <- lm(log10(mtcars$mpg10) ~ mtcars$wt)
ggplot(mtcars, aes(x = wt, y = mpg10))
geom_point()
scale_y_continuous(trans='log10')
stat_smooth(method = "lm", fullrange = TRUE, se = FALSE)
labs(title = paste("R^2 = ",signif(summary(model)$adj.r.squared, 2),
" ",
"y=",signif(model$coefficients[[1]],3 ),
signif(model$coefficients[[2]], 3), "x"))