ggplot - how to show different cutoffs using facets in ggplot WITHOUT calculating an intermediate su

This can be considered a follow-up to

However, I noticed that the displayed xintercept is the same in all facets, around 3.25. I suspect that the function is calculating the overall mean of wt and then plotting it, instead of plotting the mean of wt for all different levels of faceting variables. How can I achieve this?

Calculating an intermediate summary dataframe which is then passed as data to geom_vline is not an acceptable answer, because I am trying to learn how to access the data that was passed to ggplot from within geom_vline. If that is completely impossible to do without the intermediate dataframe, then your answer should be just "that is completely impossible to do without the intermediate dataframe". Thanks.

CodePudding user response：

It is possible to access the data that was passed to the initial ggplot call by using the data parameter of a geom or stat layer. If one uses tidyverse/magrittr-style pipe notation, with . being used as a placeholder for the data passed to the ggplot call, then the summary data frame can be created on the fly.

This isn't terribly useful in the example here, but can be invaluable if the ggplot is at the end of a complex data wrangling pipe.

I was puzzled by this feature when I first discovered it, since there is no data being piped through the operator, and naively, one would not expect this syntax to work. However, a magrittr pipe that starts with . creates a functional sequence, which is an ordinary function. A layer's data argument accepts a function, and ggplot2 applies it to the main plot data when the layer is being built.
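To see why this works, here is a small sketch that builds the same kind of pipe outside of any plot. The names panel_means and means are only illustrative, and dplyr is assumed to be installed (it re-exports %>%).

```r
library(dplyr)

# A pipe that starts with `.` is not evaluated immediately;
# magrittr turns it into a function (a "functional sequence").
panel_means <- . %>%
  group_by(gear, vs) %>%
  summarize(wt = mean(wt), .groups = "drop")

is.function(panel_means)

# Calling it by hand on mtcars gives the data frame that the
# geom_vline layer would receive: one row per gear/vs combination.
means <- panel_means(mtcars)
means
```

When the function is given as the data argument of a layer, ggplot2 makes this call itself, using the data from the main ggplot call.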

The upshot is that you can do:

library(ggplot2)
library(dplyr)   # group_by(), summarize(), %>%
library(ggh4x)   # facet_grid2()

ggplot(mtcars, aes(x = wt, y = qsec)) +
  geom_point() +
  facet_grid2(gear ~ vs, scales = "free", axes = "x", independent = "all") +
  geom_vline(data = . %>% group_by(gear, vs) %>% summarize(wt = mean(wt)),
             aes(xintercept = wt))
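
The pipe placeholder is not essential here: as far as I know, a layer's data argument also accepts a plain function, which is called with the plot data. Below is a minimal sketch of that variant, assuming only ggplot2 is installed. Note that facet_grid is used in place of facet_grid2 to avoid the ggh4x dependency, and base R's aggregate replaces the dplyr summary; both are substitutions made for this illustration.

```r
library(ggplot2)

p <- ggplot(mtcars, aes(x = wt, y = qsec)) +
  geom_point() +
  facet_grid(gear ~ vs, scales = "free") +
  # The function receives the data given to ggplot() and must
  # return a data frame; here, the mean of wt per gear/vs panel.
  geom_vline(
    data = function(d) aggregate(wt ~ gear + vs, data = d, FUN = mean),
    aes(xintercept = wt)
  )

p
```

Because the summarised data frame keeps the gear and vs columns, each mean is drawn only in its own panel.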

CodePudding user response：

Using an intermediate dataframe or the approach outlined by @AllanCameron which computes the intermediate dataframe on the fly would be my preferred way to achieve the desired result.

But just as a reference, I would point out that a second option is the gggrid package, which is meant to overcome some of the limitations of ggplot2 and, in this particular case, allows you to draw a mean vline per panel. However, it requires some knowledge of grid and becomes a bit cumbersome if you also want to take account of other aesthetics, like colour, etc. That's why IMHO the approach outlined by Allan is the "best" approach for your use case.

library(ggplot2)
library(gggrid)
#> Loading required package: grid

mean_vline <- function(data, coords) {
  x <- xend <- mean(coords$xintercept)
  segmentsGrob(x, unit(0, "npc"), xend, unit(1, "npc"))
}

base <- ggplot(mtcars, aes(x = wt, y = qsec)) +
  geom_point() +
  ggh4x::facet_grid2(gear ~ vs, scales = "free", axes = "x", independent = "all")

base +
  grid_panel(mean_vline, aes(xintercept = wt))