In base R, we can use the function hist() to create a density histogram of a given variable, say x. If we write:
h <- hist(x, freq=FALSE)
then h$mids is a vector containing the midpoint of each bin, and h$density contains the density of each bin. I want to draw my density histogram with ggplot2, using geom_histogram().
Is there a way to retrieve the equivalent values (the midpoint and density of each bin) from the ggplot2 functions?
Answer:
You can accomplish this by creating a histogram with ggplot() + geom_histogram(), and then calling ggplot_build() on the plot object to extract the values computed for each bin: midpoints, min and max values, densities, counts, etc.
Here's a simple example using the built-in iris dataset:
library(ggplot2)
# make a histogram using the iris dataset and ggplot()
h <- ggplot(data = iris) +
  geom_histogram(mapping = aes(x = Petal.Width),
                 bins = 11)
# extract the histogram's underlying features using ggplot_build()
vals <- ggplot_build(h)$data[[1]]
# print the bin midpoints
vals$x
## 0.00 0.24 0.48 0.72 0.96 1.20 1.44 1.68 1.92 2.16 2.40
# print the bin densities
vals$density
## 0.1388889 1.0000000 0.2500000 0.0000000 0.1944444 0.5833333 0.5555556 0.5000000 0.3055556 0.2500000 0.3888889
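If the goal is to reproduce what hist() returns, one option is to give geom_histogram() the same breaks that base R chose and then compare the two sets of values. The following is only a sketch under a few assumptions: it uses after_stat(), which needs ggplot2 3.3.0 or later, and the names p, mids_match, density_match, bin_width and manual_density are made up for illustration. Both functions close bins on the right by default, so the midpoints and densities should agree, but that is worth checking on your own data rather than taking for granted.

```r
library(ggplot2)

# Base R histogram, computed without plotting, to get reference breaks
h <- hist(iris$Petal.Width, plot = FALSE)

# Density histogram in ggplot2 that reuses the base R breaks
p <- ggplot(iris, aes(x = Petal.Width)) +
  geom_histogram(aes(y = after_stat(density)), breaks = h$breaks)

# Per-bin values computed by ggplot2
vals <- ggplot_build(p)$data[[1]]

# Counterparts of h$mids and h$density
vals$x
vals$density

# Check whether the two approaches agree (TRUE means they match)
mids_match <- isTRUE(all.equal(h$mids, vals$x))
density_match <- isTRUE(all.equal(h$density, vals$density))
print(c(mids = mids_match, density = density_match))

# The density of a bin is its count divided by (total count * bin width)
bin_width <- vals$xmax - vals$xmin
manual_density <- vals$count / (sum(vals$count) * bin_width)
```

The last two lines recompute the density from the count, xmin and xmax columns, which can be handy when the bins do not all have the same width.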