Illustrate standard deviation in histogram

Time:03-22

Consider the following simple example:

# E. Musk in Grunheide 
set.seed(22032022) 

# generate random numbers 
randomNumbers <- rnorm(n=1000, mean=10, sd=10)

# empirical sd 
sd(randomNumbers)
#> [1] 10.34369

# histogram 
hist(randomNumbers, probability = TRUE, main = "", breaks = 50)

# just for illustration purposes
###
# empirical density 
lines(density(randomNumbers), col='black', lwd = 2)
# theoretical density
curve(dnorm(x, mean = 10, sd = 10), col = "blue", lwd=2, add=TRUE)
###

Created on 2022-03-22

CodePudding user response:

Here is a ggplot solution. First compute the mean and sd and store them in separate variables. Then use ifelse() to label each value as "Within range" (within one sd of the mean) or "Outside range", and map that label to the fill colour.

The blue line is the theoretical normal density from your question (mean 10, sd 10), and the black line is the empirical density of the sample. Both are scaled to counts so they line up with the histogram.
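The categorisation step can be tried on its own, without any plotting. This is a minimal sketch that reuses the seed and parameters from the question; the names `m`, `s`, and `group` are illustrative and not part of the answer's code.

```r
set.seed(22032022)
randomNumbers <- rnorm(n = 1000, mean = 10, sd = 10)

m <- mean(randomNumbers)  # empirical mean
s <- sd(randomNumbers)    # empirical standard deviation

# label each value by whether it lies more than one sd away from the mean
group <- ifelse(abs(randomNumbers - m) > s, "Outside range", "Within range")

# counts per label
table(group)

# share of values within one sd; for normal data this should be near 0.68
mean(group == "Within range")
```

Comparing the share against the roughly 68% expected for a normal distribution is a quick sanity check that the colouring in the plot will mark the intended region.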

library(ggplot2)

set.seed(22032022) 

# generate random numbers 
randomNumbers <- rnorm(n=1000, mean=10, sd=10)

randomNumbers_mean <- mean(randomNumbers)
randomNumbers_sd <- sd(randomNumbers)

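ggplot(data.frame(randomNumbers = randomNumbers), aes(randomNumbers)) +
  # fill by whether each value lies within one sd of the mean
  geom_histogram(aes(
    fill = ifelse(
      randomNumbers > randomNumbers_mean + randomNumbers_sd |
        randomNumbers < randomNumbers_mean - randomNumbers_sd,
      "Outside range",
      "Within range"
    )
  ),
  binwidth = 1, col = "gray") +
  # empirical density rescaled to counts (after_stat() needs ggplot2 >= 3.3.0)
  geom_density(aes(y = after_stat(count))) +
  # theoretical density scaled by n * binwidth = 1000 * 1
  stat_function(fun = function(x) dnorm(x, mean = 10, sd = 10) * 1000,
                color = "blue") +
  labs(fill = "Data")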

Created on 2022-03-22
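For comparison, roughly the same picture can be sketched in base graphics, closer to the code in the question. This is not part of the answer above, only one possible variant: it colours whole bins by their midpoint, so bins straddling the mean ± sd boundaries are assigned to one side, and the names `m`, `s`, `h`, and `inside` are illustrative.

```r
set.seed(22032022)
randomNumbers <- rnorm(n = 1000, mean = 10, sd = 10)

m <- mean(randomNumbers)
s <- sd(randomNumbers)

# compute the histogram without drawing it, so the bins can be coloured
h <- hist(randomNumbers, breaks = 50, plot = FALSE)

# a bin counts as "inside" when its midpoint is within one sd of the mean
inside <- h$mids > m - s & h$mids < m + s

# draw on the density scale, one colour per bin
plot(h, freq = FALSE, main = "", xlab = "randomNumbers",
     col = ifelse(inside, "steelblue", "grey80"))

# mark the mean (solid) and mean +/- one sd (dashed)
abline(v = c(m - s, m, m + s), lty = c(2, 1, 2), lwd = 2)

# theoretical density from the question
curve(dnorm(x, mean = 10, sd = 10), col = "blue", lwd = 2, add = TRUE)
```

Because the plot is on the density scale, the theoretical curve needs no rescaling by the sample size here.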
