Create geom_vline for mean value in a density plot, for a new variable in the dataframe, without cre

Time:08-01

Here I am looking at the mtcars dataset.

I create a density plot for the wt variable using the ggplot2 package. I also use the geom_vline() layer to add a vertical line at the mean of wt.

library(ggplot2)

ggplot(mtcars, aes(x = wt)) +
  geom_density() +
  geom_vline(xintercept = mean(mtcars$wt))


I then switch the syntax a bit to start with the dataframe and then pipe into ggplot. I do this because I want to add a step where I create a new variable. In this example, I create wt2, which is wt ^ 2.

library(dplyr)

mtcars %>%
  mutate(wt2 = wt ^ 2) %>%
  ggplot(aes(wt2)) +
  geom_density()


I find that I am no longer able to add a geom_vline() layer in the same way that I did before, using this new syntax. Is there something I am doing wrong?

mtcars %>%
  mutate(wt2 = wt ^ 2) %>%
  ggplot(aes(wt2)) +
  geom_density() +
  geom_vline(xintercept = mean(mtcars$wt2))


Now, the code below creates the graph I want, but only after creating a new table, which I want to avoid. I'm working with a large dataset, and creating new tables causes memory issues: I don't seem to have enough space in the environment I'm using.

mtcars_new_df <- mtcars %>% mutate(wt2 = wt ^ 2)

mtcars_new_df %>%
  ggplot(aes(wt2)) +
  geom_density() +
  geom_vline(xintercept = mean(mtcars_new_df$wt2))


The reason I want to avoid a workflow where I create a new dataframe is because of memory and time issues. (I'm using a dataset much larger than the mtcars dataset.)

CodePudding user response:

One option would be to use stat_summary to compute the mean of the variable mapped on x like so:

Note: As stat_summary by default applies to the variable mapped on y, we have to set orientation = "y" to compute the mean of x. Additionally, stat_summary requires both an x and a y, so I set the latter to 0, which should be fine for geom_density.

library(ggplot2)
library(dplyr, warn.conflicts = FALSE)

mtcars %>%
  mutate(wt2 = wt ^ 2) %>%
  ggplot(aes(wt2)) +
  geom_density() +
  stat_summary(aes(xintercept = after_stat(x), y = 0), fun = mean, geom = "vline", orientation = "y")

A second example with the original wt:


mtcars %>%
  ggplot(aes(wt)) +
  geom_density() +
  stat_summary(aes(xintercept = after_stat(x), y = 0), fun = mean, geom = "vline", orientation = "y")
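For context, here is a small base R sketch of why the attempt in the question most likely draws no line. mutate() returns a modified copy and never changes mtcars itself, so mtcars$wt2 is NULL, and mean() of NULL gives NA with a warning. The variable names below are only illustrative.

```r
# mutate() does not modify mtcars in place, so the column is absent there
has_wt2 <- "wt2" %in% names(mtcars)

# `$` on a column that does not exist returns NULL rather than an error
missing_col <- mtcars$wt2

# mean(NULL) warns ("argument is not numeric or logical") and returns NA
bad_mean <- suppressWarnings(mean(missing_col))

print(has_wt2)
print(is.null(missing_col))
print(bad_mean)
```

An NA xintercept gives geom_vline() nothing to draw, which would explain the missing line.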

CodePudding user response:

Here is how we could do it. The idea is to save the intermediate result partway through the pipe so that geom_vline() can refer to it:

mtcars %>%
  mutate(wt2 = wt ^ 2) %>%
  {. ->> intermediateResult} %>%   # this saves the intermediate result in the global environment
  ggplot(aes(wt2)) +
  geom_density() +
  geom_vline(xintercept = mean(intermediateResult$wt2))
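Note that ->> leaves the mutated data frame bound to a name in the global environment after the plot is built. If that is a concern, one possible alternative (a sketch, not something taken from the answers above) is to pass a function as the data argument of geom_vline(). ggplot2 calls such a function on the plot data and uses the data frame it returns for that layer. The column name mean_wt2 and the object name p are made up for this illustration.

```r
library(ggplot2)
library(dplyr, warn.conflicts = FALSE)

p <- mtcars %>%
  mutate(wt2 = wt ^ 2) %>%
  ggplot(aes(wt2)) +
  geom_density() +
  geom_vline(
    # the function receives the plot data and returns a one-row summary
    data = function(d) summarise(d, mean_wt2 = mean(wt2)),
    # mean_wt2 is a hypothetical column name created just above
    aes(xintercept = mean_wt2)
  )

p
```

The summary here is a single row, so the layer draws exactly one line and no extra object is assigned by hand.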
