I need to create a custom boxplot in R whose box and whiskers are drawn from the 0.05, 0.20, 0.50, 0.80 and 0.95 quantiles rather than the defaults.
The default plot was generated using this code:
ggplot(data, aes(Site, LOG10Val)) +
  geom_boxplot()
To specify the custom bounds of the boxplots, the code I used was:
ggplot(data, aes(Site, LOG10Val)) +
  stat_summary(geom = "boxplot",
               fun.data = function(x) setNames(quantile(x, c(0.05, 0.2, 0.5, 0.8, 0.95)),
                                               c("ymin", "lower", "middle", "upper", "ymax")),
               position = "dodge")
This draws the boxes with the custom bounds, but the outliers are no longer shown.
Is there a way to reintroduce the outliers (i.e. values above the 95th percentile) into the custom boxplot?
Thanks.
Edit: my data structure is as follows:
# A tibble: 6 x 5
  Date       Site  Analyte      Value LOG10Val
  <date>     <fct> <fct>        <dbl>    <dbl>
1 2014-01-10 E     Ammonia_mg.L 0.02     -1.70
2 2014-01-10 C     Ammonia_mg.L 0.01     -2
3 2014-01-10 D     Ammonia_mg.L 0.015    -1.82
4 2014-01-31 E     Ammonia_mg.L 0.01     -2
5 2014-01-31 C     Ammonia_mg.L 0.01     -2
6 2014-01-31 D     Ammonia_mg.L 0.01     -2
CodePudding user response:
One option to achieve your desired result would be to add the outliers via a second stat_summary layer.
Using iris as example data:
library(ggplot2)

ggplot(iris, aes(x = Species, y = Sepal.Length)) +
  # Box and whiskers drawn from the custom quantiles
  stat_summary(
    geom = "boxplot",
    fun.data = function(x) {
      setNames(
        quantile(x, c(0.05, 0.2, 0.5, 0.8, 0.95)),
        c("ymin", "lower", "middle", "upper", "ymax")
      )
    }
  ) +
  # Points outside the 5th-95th percentile range; all others become NA and are dropped
  stat_summary(geom = "point", fun = function(x) {
    outlier_high <- x > quantile(x, .95)
    outlier_low <- x < quantile(x, .05)
    ifelse(outlier_high | outlier_low, x, NA)
  }, na.rm = TRUE)
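
If you want to check which values the second layer will draw before plotting, the same quantile logic can be run outside ggplot2. The following is only a rough sketch in base R, again on iris; custom_box and custom_outliers are hypothetical helper names introduced here for illustration, not ggplot2 functions.

```r
# Hypothetical helpers (not part of ggplot2): the same quantile logic as the
# two stat_summary layers, run directly on a numeric vector.
custom_box <- function(x, probs = c(0.05, 0.2, 0.5, 0.8, 0.95)) {
  setNames(quantile(x, probs), c("ymin", "lower", "middle", "upper", "ymax"))
}

custom_outliers <- function(x, low = 0.05, high = 0.95) {
  bounds <- quantile(x, c(low, high))
  # Strict inequalities, as in the plotting code: values equal to a bound
  # are not treated as outliers.
  x[x < bounds[1] | x > bounds[2]]
}

# Split the measurements by group, then summarise each group.
by_species <- split(iris$Sepal.Length, iris$Species)
box_stats <- sapply(by_species, custom_box)      # 5 x 3 matrix of box bounds
outliers <- lapply(by_species, custom_outliers)  # list of outlier values per group

print(box_stats)
print(outliers)
```

With your own data, the equivalent call would presumably be split(data$LOG10Val, data$Site), assuming the column names shown in the question.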