I can make the following plot very easily by computing the group means using dplyr, but is there a way to do it entirely within ggplot2, without preprocessing the data, using stat_<something>?
library(tidyverse)

# Compute per-species means first, then plot one point per species
iris |>
  group_by(Species) |>
  summarise(
    Sepal.Length = mean(Sepal.Length),
    Sepal.Width = mean(Sepal.Width)
  ) |>
  ggplot() +
  geom_point(aes(x = Sepal.Length, y = Sepal.Width, color = Species))
stat_summary seems to summarize only at identical x or y values, and stat_bin doesn't work across discrete variables, but is there another stat_* for this? I've found stat_centroid from ggh4x, but I'm looking for something built-in.
Edit: to be clear about my goals, I'm looking to avoid the duplication of the x/y/color column names if possible!
Answer:
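The closest you can get, I think, is to embed the aggregation inside the stat_summary call by passing a function as the layer's data argument. ggplot2 applies that function to the plot's default data, so the column names only need to appear once, in aes().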
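A minimal sketch of that idea, assuming ggplot2 and dplyr (1.0.0 or later, for across()) are installed. The helper name group_means is hypothetical and only there for readability. geom = "point" is set explicitly because only a mean is computed, so there are no ymin/ymax values for the default pointrange geom.

```r
library(ggplot2)
library(dplyr)

# Hypothetical helper: collapses a data frame to one row of means per Species.
# ggplot2 calls it on the plot's default data (iris here) when it is passed
# as a layer's `data` argument.
group_means <- function(d) {
  d |>
    group_by(Species) |>
    summarise(across(everything(), mean))
}

# x, y and color are named once, in the top-level aes().
p <- ggplot(iris, aes(Sepal.Length, Sepal.Width, color = Species)) +
  stat_summary(data = group_means, fun = mean, geom = "point")

p
```

Since each species has a single row after aggregation, stat_summary has nothing left to average and simply passes the three points through; the real work happens in the data function.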
An alternative would be to pass a little summary data frame to the points layer using summarize_all:
# The layer's data replaces the raw iris rows with one row of means per species
ggplot(iris, aes(Sepal.Length, Sepal.Width, color = Species)) +
  geom_point(data = summarize_all(group_by(iris, Species), mean))