Home > Mobile >  calculating mean and variance using monte carlo methods, having Density Kernel
calculating mean and variance using monte carlo methods, having Density Kernel

Time:11-06

assuming the density kernel to be equal to be:https://i.stack.imgur.com/irjRs.png, what monte carlo methods can I use to estimate the mean and variance of the destribuation?

CodePudding user response:

We can use numerical methods here. First of all, we create a function to represent your probability density function (though this is not yet scaled so that its integral is 1):

pdf <- function(x) x^2 * exp(-x^2/4)

plot(pdf, xlim = c(0, 10))

We can see that almost all of the area under the curve occurs where x < 10, so if we find the integral at, say, x = 100, we should have a very accurate scaling factor to generate a true pdf:

integrate(pdf, 0, 100)$value
#> [1] 3.544908

So now we can generate a genuine pdf:

pdf <- function(x)  x^2 * exp(-x^2/4) / 3.544908

plot(pdf, xlim = c(0, 10))

Now that we have a pdf, we can create a cdf with numerical integration:

cdf <- function(x) sapply(x, \(i) integrate(pdf, 0, i)$value)

plot(cdf, xlim = c(0, 10))

The inverse of the cdf is what we need to be able to convert a sample taken from a uniform distribution between 0 and 1 into a sample drawn from our new distribution. We can create an inverse function using uniroot to find where the output of our cdf matches an arbitrary number between 0 and 1:

inverse_cdf <- function(p) 
{
  sapply(p, function(i) {
              uniroot(function(a) {cdf(a) - i}, c(0, 100))$root
  })
}

The inverse cdf looks like this:

plot(inverse_cdf, xlim = c(0, 0.99))

We are now ready to draw a sample from our distribution:

set.seed(1) # Makes this draw reproducible

x_sample <- inverse_cdf(runif(1000))

Now we can plot a histogram of our sample and ensure it matches the pdf:

hist(x_sample, freq = FALSE)

plot(function(x) pdf(x), add = TRUE, xlim = c(0, 6))

Now we are confident that we have a sample drawn from x, we can use the sample mean and standard deviation as estimates for the distribution's mean and standard deviation:

mean(x_sample)
#> [1] 2.264438

sd(x_sample)
#> [1] 0.9625839

We can increase the accuracy of these estimates by taking larger and larger samples in our call to inverse_cdf(runif(1000)), by increasing the 1000 to a larger number.

Created on 2021-11-06 by the reprex package (v2.0.0)

  • Related