I want to show the mean inside a boxplot, but the mean point is not drawn where it is supposed to be. Calculated from the data, the mean is 16.2, which corresponds to about 1.2 on the log10 scale (log10(16.2) ≈ 1.21). I tried various things, e.g., placing the stat_summary call before or after the scale transformation, but this does not work.
Help is much appreciated!
Yours,
Kristof
Code:
Data:
df <- c(2e-05, 0.38, 0.63, 0.98, 0.04, 0.1, 0.16, 0.83, 0.17, 0.09, 0.48, 4.36, 0.83, 0.2, 0.32, 0.44, 0.22, 0.23, 0.89, 0.23, 1.1, 0.62, 5, 340, 47) %>% as_tibble()
Output:
df %>%
  ggplot(aes(x = 0, y = value)) +
  geom_boxplot(width = .12, outlier.color = NA) +
  stat_summary(fun = mean, geom = "point", shape = 21, size = 3, color = "black", fill = "grey") +
  labs(
    x = "",
    y = "Particle counts (P/kg)"
  ) +
  scale_y_log10(breaks = trans_breaks("log10", function(x) 10^x), labels = trans_format("log10", math_format(10^.x)))
Answer:
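The mean calculated by stat_summary is the mean of log10(value), not of value. Scale transformations such as scale_y_log10 are applied before the summary statistics are computed, so fun = mean receives the log-transformed values. The plotted point is therefore the log10 of the geometric mean, not log10(16.2) ≈ 1.21.
Below I propose to define a new function my_mean, which back-transforms the values, averages them, and takes log10 of the result, for a correct calculation of the average value.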
library(ggplot2)
library(dplyr)
library(tibble)
library(scales)
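df <- c(2e-05, 0.38, 0.63, 0.98, 0.04, 0.1, 0.16,
        0.83, 0.17, 0.09, 0.48, 4.36, 0.83, 0.2, 0.32, 0.44,
        0.22, 0.23, 0.89, 0.23, 1.1, 0.62, 5, 340, 47) %>% as_tibble()
# Define the mean function.
# x arrives already log10-transformed, so back-transform it,
# take the arithmetic mean, and return to the log10 scale.
my_mean <- function(x) {
  log10(mean(10^x))
}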
df %>%
  ggplot(aes(x = 0, y = value)) +
  geom_boxplot(width = .12, outlier.color = NA) +
  stat_summary(fun = my_mean, geom = "point", shape = 21, size = 3, color = "black", fill = "grey") +
  labs(
    x = "",
    y = "Particle counts (P/kg)"
  ) +
  scale_y_log10(breaks = trans_breaks("log10", function(x) 10^x),
                labels = trans_format("log10", math_format(10^.x)))
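
The difference between the two means can be checked numerically without plotting. This is only a small sketch in base R using the data from the question. The variable names (value, mean_of_logs, log_of_mean, arith_mean, geo_mean) are chosen here for illustration and are not part of the original code.

```r
# Data from the question, as a plain numeric vector
value <- c(2e-05, 0.38, 0.63, 0.98, 0.04, 0.1, 0.16,
           0.83, 0.17, 0.09, 0.48, 4.36, 0.83, 0.2, 0.32, 0.44,
           0.22, 0.23, 0.89, 0.23, 1.1, 0.62, 5, 340, 47)

# What fun = mean computes on a log10 y scale: the mean of the logs
mean_of_logs <- mean(log10(value))

# my_mean as defined in the answer: back-transform, average, re-log
my_mean <- function(x) {
  log10(mean(10^x))
}
log_of_mean <- my_mean(log10(value))

# The same two quantities on the original scale
arith_mean <- mean(value)        # arithmetic mean, about 16.2
geo_mean   <- 10^mean_of_logs    # geometric mean, smaller than arith_mean

print(c(arith_mean = arith_mean, geo_mean = geo_mean))
print(c(log_of_mean = log_of_mean, mean_of_logs = mean_of_logs))
```

Here log_of_mean should come out at about 1.21, matching log10(16.2), while mean_of_logs is lower because the geometric mean is pulled down by the very small value 2e-05 and is much less affected by the outlier 340.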