geom_smooth() (RED) appears to be more "smooth" when plotted in ggplot2 than if I plot the values of stats::loess with geom_line() (BLUE). Why? And how do you make the geom_line() look like the line produced by geom_smooth()?
Reprex:
library(dplyr)
library(ggplot2)
# Data
data <- structure(list(date_int = c(0.834136630343671, 0.848910310142498,
0.851948868398994, 0.857082984073764, 0.866093880972339, 0.86955155071249,
0.874895222129086, 0.925660100586756, 0.937709555741827, 0.957355406538139,
0.977525146689019, 0.996070829840738, 0.998428331936295, 0.998428331936295,
0.998480720871752, 0.998795054484493, 0.999161777032691, 0.999528499580889,
0.999895222129086, 1, 1), value = c(51.78, 46.2, 44.01, 41.1,
39.1, 38.19, 42.87, 42.47, 37.22, 41.6, 44.7, 39.7, 23, 28.7,
23, 30.9, 35.4, 35.8, 32.4, 31, 31)), row.names = c(NA, -21L), class = c("tbl_df",
"tbl", "data.frame"))
# Add loess fitted values manually (one value per observed row)
data <- data %>%
  mutate(pred_loess = stats::loess(value ~ date_int, method = "loess")$fitted)
# Plot red and blue
ggplot(data,
       aes(x = date_int,
           y = value)) +
  geom_point() +
  geom_smooth(colour = "red", size = 1, se = FALSE) +
  # `se` is not a geom_line() parameter, so it is dropped here
  geom_line(aes(y = pred_loess), colour = "blue", size = 1) +
  labs(title = "RED (geom_smooth) is smoother\nthan BLUE (geom_line)")
Answer:
To manually plot the loess line, make a new dataframe with regularly spaced x-values and use the predict() function to find the corresponding values of the y-variable.
library(dplyr)
library(ggplot2)
# Data
data <- structure(list(date_int = c(0.834136630343671, 0.848910310142498,
0.851948868398994, 0.857082984073764, 0.866093880972339, 0.86955155071249,
0.874895222129086, 0.925660100586756, 0.937709555741827, 0.957355406538139,
0.977525146689019, 0.996070829840738, 0.998428331936295, 0.998428331936295,
0.998480720871752, 0.998795054484493, 0.999161777032691, 0.999528499580889,
0.999895222129086, 1, 1), value = c(51.78, 46.2, 44.01, 41.1,
39.1, 38.19, 42.87, 42.47, 37.22, 41.6, 44.7, 39.7, 23, 28.7,
23, 30.9, 35.4, 35.8, 32.4, 31, 31)), row.names = c(NA, -21L), class = c("tbl_df",
"tbl", "data.frame"))
fit <- stats::loess(value ~ date_int, data = data)
# Make data.frame for loess trend
fit_df <- data.frame(
date_int = seq(min(data$date_int), max(data$date_int), length.out = 500)
)
fit_df$value <- predict(fit, newdata = fit_df)
# Plot red and blue
ggplot(data,
       aes(x = date_int,
           y = value)) +
  geom_point() +
  geom_smooth(colour = "red", size = 1, se = FALSE) +
  geom_line(data = fit_df, colour = "blue", size = 1) +
  labs(title = "RED (geom_smooth) is smoother\nthan BLUE (geom_line)")
#> `geom_smooth()` using method = 'loess' and formula 'y ~ x'
Created on 2022-04-20 by the reprex package (v0.3.0)
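As mentioned in the comments, your previous approach only gave fitted values at the data points in your dataframe, not along a sequence of x-values. geom_line() then joins those few points with straight segments, whereas geom_smooth() evaluates the loess fit on an evenly spaced grid (80 points by default), which is why it looks smoother.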
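To see the difference between the two sets of values without plotting anything, here is a minimal sketch in base R. It uses the built-in `cars` data set as a stand-in for your data (an assumption for illustration only), and the names `at_obs` and `grid` are made up for this example. The grid length of 80 is chosen to mirror the default `n` of geom_smooth().

```r
# Stand-in data: the built-in `cars` data set (an assumption, not the data from the question)
fit <- stats::loess(dist ~ speed, data = cars)

# Fitted values exist only at the observed x-values, one per row of the data
at_obs <- fitted(fit)

# Evenly spaced grid across the observed range of x;
# 80 points mirrors the default `n` used by geom_smooth()
grid <- data.frame(
  speed = seq(min(cars$speed), max(cars$speed), length.out = 80)
)
grid$dist <- predict(fit, newdata = grid)

length(at_obs)  # one value per row of `cars`
nrow(grid)      # number of grid points
```

A data frame like `grid` can then be passed to geom_line(data = grid), in the same way as `fit_df` in the code above.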