I am a fairly experienced ggplot2 user and teach it to university students. However, I only just came across an example that uses the following syntax:
ggplot(mtcars) + aes(cyl) + geom_histogram()
This fits a lot better into the logic of adding up layers than specifying aes inside ggplot() or the geom_..., but it does not seem to be documented anywhere in the ggplot2 help. Therefore, I am wondering whether there are any reasons why this syntax is limited or should not be used. (Obviously, I see that it needs to be specified in the geom if it is meant to differ between geoms.)
CodePudding user response:
This is verging on an opinion-based question, but I think it is on-topic, since it helps to clarify the syntax and structure of ggplot calls.
In a sense you have already answered the question yourself:
it does not seem to be documented anywhere in the ggplot2 help
This, and the near absence of examples in online tutorials, blogs and SO answers, is a good enough reason not to use aes this way (or at least not to teach people to use it this way). It could lead to confusion and frustration on the part of new users.
This fits a lot better into the logic of adding up layers
This is sort of true, but could be a bit misleading. What it actually does is specify the default aesthetic mapping, which subsequent layers will inherit from the ggplot object itself. It should be considered a core part of the base plot, along with the default data object, and therefore "belongs" in the initial ggplot call, rather than being something that is added or layered on to the plot. If you create a default ggplot object without data and mapping, the slots are still there, but hold an empty mapping and a waiver object rather than being NULL:
p <- ggplot()
p$mapping
#> Aesthetic mapping:
#> <empty>
p$data
#> list()
#> attr(,"class")
#> [1] "waiver"
Note that unlike the scales and co-ordinate objects, for which you might argue that the same is also true, there can be no defaults for data and aesthetic mappings.
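To illustrate the point about the default mapping, here is a minimal sketch (assuming ggplot2 is installed; mtcars ships with R, and rlang is used only to print the mapped variable). It suggests that both spellings end up filling the same mapping slot of the plot object:

```r
library(ggplot2)

# Mapping supplied in the initial call
a <- ggplot(mtcars, aes(cyl))

# Mapping added afterwards with `+ aes()`
b <- ggplot(mtcars) + aes(cyl)

# Both approaches populate the default-mapping slot of the plot object
names(a$mapping)
names(b$mapping)

# The variable mapped to x is the same in both cases
rlang::as_label(a$mapping$x)
rlang::as_label(b$mapping$x)
```

Neither plot has any layers yet; the mapping lives on the plot object itself, which is why it reads more like part of the base plot than like a layer.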
Does this mean you should never use this syntax? No, but it should be considered an advanced trick for folks who are well versed in ggplot. The most frequent use case I find for it is in changing the mapping of ggplots that are created in extension packages, such as ggsurvplot or ggraph, where the plotting functions use wrappers around ggplot. It can also be used to quickly create multiple plots with the same themes and colour scales:
p <- ggplot(iris, aes(Sepal.Width, Sepal.Length)) +
  geom_point(aes(color = Species)) +
  theme_light()
library(patchwork)
p + (p + aes(Petal.Width, Petal.Length))
So the bottom line is that you can use this if you want, but it is best to avoid teaching it to beginners.
CodePudding user response:
TL;DR
I cannot see any strong reasons why not to use this pattern, but other patterns are recommended in the documentation, without elaboration.
What does + aes() do?
A ggplot has two types of aesthetics:
- the default one (typically supplied inside ggplot()), and
- geom_*() specific aesthetics
If inherit.aes = TRUE is set inside the geoms (this is the default), then these two types of aesthetics are combined in the final plot. If the default aesthetic is not set, then the geom_*() specific aesthetics must be set.
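The interplay between the default and the layer-specific aesthetics can be sketched as follows. This is only an illustration on mtcars (the choice of hp, disp and the red colour is arbitrary): the first layer inherits the default mapping, while the second opts out with inherit.aes = FALSE and therefore has to bring its own complete mapping.

```r
library(ggplot2)

p <- ggplot(mtcars) +
  aes(hp, mpg) +              # default mapping for the whole plot
  geom_point() +              # inherits x = hp, y = mpg
  geom_point(
    aes(disp, mpg),           # layer-specific mapping
    inherit.aes = FALSE,      # ignore the plot default entirely
    colour = "red"
  )

# Inspect the x values each layer actually ends up with
head(layer_data(p, 1)$x)      # taken from hp
head(layer_data(p, 2)$x)      # taken from disp
```

With inherit.aes left at its default, the second layer would instead merge its own mapping with the plot default, with the layer's values taking precedence where both define the same aesthetic.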
Using ggplot(df) + aes(x, y) changes the default aesthetic. This is documented in ?"+.gg":
An aes() object replaces the default aesthetics.
Are there any reasons not to use it?
I cannot see any strong reasons not to. However, the documentation of ?ggplot states that:
There are three common ways to invoke ggplot():
- ggplot(df, aes(x, y, other aesthetics))
- ggplot(df)
- ggplot()
The first method is recommended if all layers use the same data and the same set of aesthetics.
As far as I can see, the typical use case for + aes() is when all layers use the same aesthetics. So the documentation recommends the usual pattern ggplot(df, aes(x, y, other aesthetics)), but I cannot find an elaboration of why.
Further: even though the plots look identical, the objects returned by ggplot(df, aes()) and ggplot(df) + aes() are not identical, so there might be some edge cases where one pattern would lead to errors or a different plot.
You can see the many small differences with this code:
library(ggplot2)
a <- ggplot(mtcars, aes(hp, mpg)) + geom_point()
b <- ggplot(mtcars) + aes(hp, mpg) + geom_point()
waldo::compare(a, b, x_arg = "a", y_arg = "b")
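As a complementary check, and only as a rough sketch, one could compare what each plot actually computes for its layer rather than the plot objects themselves. layer_data() is a ggplot2 helper that returns the data a layer would draw; if the two patterns are equivalent for this example, the comparison should report no differences:

```r
library(ggplot2)

a <- ggplot(mtcars, aes(hp, mpg)) + geom_point()
b <- ggplot(mtcars) + aes(hp, mpg) + geom_point()

# Compare the data computed for the first (and only) layer of each plot
all.equal(layer_data(a, 1), layer_data(b, 1))
```

This only covers one simple plot, so it does not rule out the edge cases mentioned above.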