What happens with ggplot geom_smooth(method = "loess") when span > 1?

Time:01-12

Imagine a ggplot that uses geom_smooth(method = "loess", span = 1.5). In that ggplot, what is the span argument telling ggplot to do to the geom_smooth lines (or the associated math)?

I have read the previous discussions of the span argument (and the related discussion of alpha), but they treat it as bounded between 0 and 1. That is not the case: span = 1.5, for example, produces a different smoothness than span = 1.

Answer:

The loess smoothing method uses stats::loess, and the help for that function gives more context about how the span parameter works when it's greater than one.

?loess

span
the parameter α which controls the degree of smoothing.

Details
Fitting is done locally. That is, for the fit at point x, the fit is made using points in a neighbourhood of x, weighted by their distance from x (with differences in ‘parametric’ variables being ignored when computing the distance). The size of the neighbourhood is controlled by α (set by span or enp.target). For α < 1, the neighbourhood includes proportion α of the points, and these have tricubic weighting (proportional to (1 - (dist/maxdist)^3)^3). For α > 1, all points are used, with the ‘maximum distance’ assumed to be α^(1/p) times the actual maximum distance for p explanatory variables.
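To make the weighting rule concrete, here is a minimal sketch of the tricube weights as described in the help text. It is an illustration only, not the code stats::loess actually runs: loess_weights() is a hypothetical helper, and using floor(n * span) for the neighbourhood size when span is at most 1 is an assumption about the rounding.

```r
# Hypothetical helper: tricube weights around a target point x0,
# following the description in ?loess (not the internal loess code).
loess_weights <- function(x, x0, span, p = 1) {
  d <- abs(x - x0)                       # distances from the target point
  if (span <= 1) {
    # Assumed rounding: keep the floor(n * span) nearest points
    q <- max(1, floor(length(x) * span))
    maxdist <- sort(d)[q]
  } else {
    # All points are used; the maximum distance is inflated by span^(1/p)
    maxdist <- max(d) * span^(1 / p)
  }
  w <- (1 - (d / maxdist)^3)^3           # tricube weighting
  w[d >= maxdist] <- 0                   # points at or beyond maxdist get no weight
  w
}

x <- 1:11
w_1   <- loess_weights(x, x0 = 6, span = 1)    # end points get weight 0
w_1.5 <- loess_weights(x, x0 = 6, span = 1.5)  # end points get weight of about 0.35
w_10  <- loess_weights(x, x0 = 6, span = 10)   # all weights are close to 1
```

With span = 1 the farthest points sit exactly at the maximum distance and get zero weight. With span = 1.5 the maximum distance is stretched to 1.5 times the real one, so even the farthest points keep a positive weight, and at span = 10 the weights are nearly uniform.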

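In other words, when span > 1 every point contributes to each local fit, and the larger span gets, the flatter the distance weighting becomes, so the fit is less "local". That is why span = 1.5 gives a smoother curve than span = 1. For very large spans the weights are almost equal, and the smooth approaches an ordinary global polynomial fit of the degree loess is using (2 by default).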
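One way to check this is to compare stats::loess fits at several spans against a plain quadratic regression. This is a sketch on simulated data; the data, the seed, the chosen spans, and the max_gap() helper are all assumptions made for illustration.

```r
set.seed(1)

# Simulated data (assumed for illustration)
d <- data.frame(x = seq(0, 10, length.out = 100))
d$y <- sin(d$x) + rnorm(100, sd = 0.3)

# Global quadratic fit: what loess with degree = 2 approaches as span grows
fit_quad <- lm(y ~ poly(x, 2), data = d)

# Hypothetical helper: largest absolute difference between the loess
# fitted values at a given span and the global quadratic fitted values
max_gap <- function(span) {
  fit <- loess(y ~ x, data = d, span = span, degree = 2)
  max(abs(fitted(fit) - fitted(fit_quad)))
}

spans <- c(1, 1.5, 5, 100)
gaps <- sapply(spans, max_gap)
print(data.frame(span = spans, max_gap = gaps))
```

The same span values can be passed to geom_smooth(method = "loess", span = ...) to see the curves flatten out in a plot.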
