Aren't these both plots of the residuals? If so, then what is the difference?

I'm analysing the Boston dataset from the

plot(predict(lm.Boston), residuals(lm.Boston))

Looking at the points on each plot and the values on the axes, there clearly seems to be a significant difference between these two "residual" plots. However, I'm struggling to understand precisely what the difference is (besides the obvious fact that one uses the generic plot function and the other uses ggplot). Aren't these both plots of the residuals? If so, then what is the difference? I'm assuming the confusion stems from the complexity of the ggplot version, since the base plot() call is quite straightforwardly a plot of the residuals vs predicted values.

CodePudding user response：

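To answer your question: no.

The first plot (the ggplot one) shows the observed data points together with their fitted values. You can "measure" or "inspect" the residuals on it as the vertical distances between the two, but the residuals themselves are not what is plotted on the y-axis.
To plot the residuals with ggplot, try this: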

library(ggfortify)
autoplot(lm.Boston)

You should get four graphs; the first one (Residuals vs Fitted) is the relevant one for you.
This is similar to what you do with the base R plot, where you really do plot the residuals.

Here is an example with the mtcars dataset:

lm.mtcars <- lm(formula = cyl ~ mpg, data = mtcars)

mtcars$residuals <- lm.mtcars$residuals
mtcars$predicted <- predict(lm.mtcars)

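# plot1 your first ggplot
library(ggplot2)
ggplot(mtcars, aes(x = mpg, y = cyl)) +   # predictor on x, response on y, matching cyl ~ mpg
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE, color = "lightgrey") +   # Plot regression line
  geom_segment(aes(xend = mpg, yend = predicted), alpha = 0.2) +   # residual segments; alpha to fade lines
  geom_point() +   # observed values
  geom_point(aes(y = predicted), shape = 1) +   # predicted values (hollow circles)
  theme_bw() # Add theme for cleaner look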

# plot2 your R base plot
par(mfrow = c(1, 1))
plot(mtcars$predicted, mtcars$residuals)

# plot3
library(ggfortify)
autoplot(lm.mtcars, which = 1, ncol = 1)
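
If you would rather not depend on ggfortify, a residuals-vs-fitted plot can also be built with ggplot2 alone. The following is only a minimal sketch, assuming ggplot2 is installed; the names resid_df and p are made up here for illustration and are not part of the code above.

```r
library(ggplot2)

# Fit the same model as in the example above
lm.mtcars <- lm(cyl ~ mpg, data = mtcars)

# Collect fitted values and residuals in their own data frame
# (resid_df is a hypothetical name used only in this sketch)
resid_df <- data.frame(
  predicted = fitted(lm.mtcars),
  residuals = residuals(lm.mtcars)
)

# Residuals vs fitted values, with a dashed reference line at zero
p <- ggplot(resid_df, aes(x = predicted, y = residuals)) +
  geom_hline(yintercept = 0, linetype = "dashed", color = "grey50") +
  geom_point() +
  theme_bw()
print(p)
```

This should show the same points as the base R plot(predicted, residuals) call, just drawn with ggplot2.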

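plot1: observed values, fitted line and residual segments (ggplot2)

plot2: residuals vs predicted values (base R)

plot3: residuals vs fitted values (ggfortify)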
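
To tie the plots together numerically: the segments in plot1 connect each observed value to its predicted value, and those signed differences are the quantities shown on the y-axis of plot2 and plot3. A small check in base R, offered as a sketch (observed, predicted, segment_lengths and same are names introduced here for illustration):

```r
lm.mtcars <- lm(cyl ~ mpg, data = mtcars)

observed  <- mtcars$cyl            # observed response values
predicted <- predict(lm.mtcars)    # fitted response values from the model

# Signed vertical distance between each observation and its fitted value
segment_lengths <- observed - predicted

# Compare against the residuals stored in the model object
same <- isTRUE(all.equal(unname(segment_lengths), unname(residuals(lm.mtcars))))
print(same)
```

So plot1 lets you see the residuals as distances between points, while plot2 and plot3 plot those distances directly.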