How to remove low frequency bins in histogram

Time:05-27

Let's say I have a data frame containing a vector of numbers that I want to visualise in a histogram. I want to show only the bins that contain more than, say, 50 observations.

Step 1

library(ggplot2)
library(magrittr)  # provides the %>% pipe

set.seed(10)
x <- data.frame(x = rnorm(1000, 50, 2))
p <- 
  x %>% 
  ggplot(., aes(x)) +
  geom_histogram()

p


Step 2

pg <- ggplot_build(p)

pg$data[[1]]

As a check, when I print pg$data[[1]] I'd like to see only the rows where count >= 50.

Thank you

CodePudding user response:

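library(ggplot2)

# Bins with 50 or fewer observations are drawn with height 0.
# after_stat(count) is the current spelling of the older ..count.. notation.
ggplot(x, aes(x = x, y = ifelse(after_stat(count) > 50, after_stat(count), 0))) +
  geom_histogram(bins = 30)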


With this code you can also see the counts of the removed bins as text labels:

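library(ggplot2)

# The text layer uses the same 30 bins and labels each one with its true count.
ggplot(x, aes(x = x, y = ifelse(after_stat(count) > 50, after_stat(count), 0))) +
  geom_histogram(bins = 30, fill = "green", color = "grey") +
  stat_bin(aes(label = after_stat(count)), bins = 30, geom = "text", vjust = -0.7)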
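
Note that the ifelse() approach only changes the drawn height, so the low-count bins still appear as rows in the built plot data. If the printed table itself should contain only rows with count >= 50, one possible sketch (not part of the answer above) is to filter the bin table and redraw from it. The names bins, kept and p_kept are made up for illustration, and using geom_rect() for the redraw is an assumption about what is acceptable here:

```r
library(ggplot2)

set.seed(10)
x <- data.frame(x = rnorm(1000, 50, 2))

# Build the full histogram once to obtain the per-bin table.
p <- ggplot(x, aes(x)) + geom_histogram(bins = 30)

# layer_data(p, 1) returns the same table as ggplot_build(p)$data[[1]].
bins <- layer_data(p, 1)

# Keep only the bins that hold at least 50 observations.
kept <- subset(bins, count >= 50)

# Redraw from the filtered table; each remaining bin becomes one rectangle.
p_kept <- ggplot(kept, aes(xmin = xmin, xmax = xmax, ymin = 0, ymax = count)) +
  geom_rect()

# The built data of the new plot now has one row per surviving bin.
layer_data(p_kept, 1)
```

Here kept is the filtered table asked for in Step 2, and printing p_kept shows the histogram without the low-frequency bins. Change the threshold in subset() to count > 50 if strictly more than 50 is wanted.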

