Remove outliers from stat_summary in ggplot2

Time:05-18

I have this code to produce a boxplot from my data:

p <- ggplot(meltData, aes(x=variable, y=value)) + 
  geom_boxplot() + 
  geom_boxplot(outlier.colour="red", outlier.shape=1, outlier.size=2) + 
  stat_summary(geom="text", fun=quantile,
               aes(label=sprintf("%1.1f", ..y..), color=factor(variable)),
               position=position_nudge(x=0.0), size=3.5, show_guide = FALSE) + 
  ggtitle("Species measurements") + 
  ggeasy::easy_center_title()
p

and I get this output:

[image]

I want the labels on my boxplot to show the upper and lower whisker values as the maximum and minimum (and not the outlier values). For example, on the 5th boxplot the labeled maximum is 72, but that point is an outlier; the maximum should be approximately 56.

CodePudding user response:

If I understand your goal correctly, you want boxplots with text labels that show the upper and lower whisker values, and no outliers shown in the plots. If so, I agree with @Death Metal that you might want to filter out the outliers per category.

However, because you didn't provide reproducible data, here is some dummy data similar to yours.

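library(dplyr)    # %>%, group_by(), filter()
library(tidyr)    # pivot_longer()
library(ggplot2)

# iris measurements plus three artificially large values per column
dat <- data.frame(var.A = c(iris$Sepal.Length, c(20,21,22)), 
                  var.B = c(iris$Petal.Length, c(20,21,22)))
meltData <- dat %>% pivot_longer(cols = c(var.A, var.B), 
                                 values_to = "value", 
                                 names_to = "variable")

ggplot(meltData, aes(x=variable, y=value)) + geom_boxplot()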

which clearly shows the outliers:

[image]
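To see which values count as outliers before filtering anything, base R's `boxplot.stats()` can be applied to a single group. The following is only a sketch on the `var.A` column of the dummy data; the names `x`, `res` and `x_kept` are made up for illustration.

```r
# Dummy vector matching var.A above: iris sepal lengths plus three large values.
x <- c(iris$Sepal.Length, 20, 21, 22)

# boxplot.stats() computes the boxplot statistics without drawing anything.
res <- boxplot.stats(x)

# $stats holds: lower whisker, lower hinge, median, upper hinge, upper whisker.
print(res$stats)

# $out holds the points that lie beyond the whiskers.
print(res$out)

# Dropping the flagged points leaves data whose range equals the whisker ends.
x_kept <- x[!x %in% res$out]
print(range(x_kept))
```

Unlike `boxplot()`, this does not open a plotting device, which makes it more convenient inside a `filter()` call.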

Here is one way to filter out the outliers before drawing the boxplots:

meltData %>% group_by(variable) %>%
     # keep only the values that boxplot.stats() does not flag as outliers within each group
     filter(!value %in% boxplot.stats(value)$out) %>% 
     ggplot(aes(x = variable, y = value)) + 
     geom_boxplot() + 
     stat_summary(geom="text", fun=quantile,
                  aes(label=sprintf("%1.1f", ..y..), color=factor(variable)),
                  position=position_nudge(x=0.0), 
                  size=3.5, show.legend = FALSE) +   # show.legend replaces the deprecated show_guide
     ggtitle("Species measurements") + 
     ggeasy::easy_center_title()

The result:

[image]
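One caveat with filtering first: the quartiles are recomputed on the reduced data, so the box can shift slightly and new points may be flagged. A possible alternative, sketched below, keeps all rows, hides the outlier points with `outlier.shape = NA`, and labels only the whisker ends. The helper `whisker_ends()` is hypothetical (not part of ggplot2); it assumes the default rule that whiskers reach the most extreme values within 1.5 * IQR of the 25th and 75th percentiles. The title is centered with `theme()` here instead of `ggeasy` only to keep the dependencies small.

```r
library(ggplot2)
library(tidyr)

# Hypothetical helper: returns the lower and upper whisker ends of y,
# assuming the usual 1.5 * IQR rule around the 25th/75th percentiles.
whisker_ends <- function(y, coef = 1.5) {
  q <- quantile(y, c(0.25, 0.75), names = FALSE)
  fence <- coef * diff(q)
  inside <- y[y >= q[1] - fence & y <= q[2] + fence]
  range(inside)
}

# Same dummy data as in the answer above.
dat <- data.frame(var.A = c(iris$Sepal.Length, 20, 21, 22),
                  var.B = c(iris$Petal.Length, 20, 21, 22))
meltData <- pivot_longer(dat, cols = c(var.A, var.B),
                         names_to = "variable", values_to = "value")

p <- ggplot(meltData, aes(x = variable, y = value)) +
  geom_boxplot(outlier.shape = NA) +            # draw boxes, hide outlier points
  stat_summary(geom = "text", fun = whisker_ends,
               aes(label = sprintf("%1.1f", after_stat(y)), color = variable),
               size = 3.5, show.legend = FALSE) +
  ggtitle("Species measurements") +
  theme(plot.title = element_text(hjust = 0.5)) # center the title
p
```

The y axis will probably still span the hidden outliers, because the scale is computed from all the data; adding `coord_cartesian(ylim = ...)` with limits of your choice is one way to zoom in without dropping rows.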
