Home > OS >  How to correct error when sampling Cauchy distribution
How to correct error when sampling Cauchy distribution

Time:05-06

I'm trying to sample values from a Cauchy distribution, using what is called (I think) inverse transform sampling. My program is as follows:

from numpy import*
from matplotlib.pyplot import*
from scipy.interpolate import interp1d

def f(x):
    return 1 / (pi*(1 x**2)) # Cauchy

x = linspace(-6, 6,100)
dx = x[1]-x[0]
cdf = cumsum(f(x))*dx

plt.plot(x,cdf)
plt.xlabel('X')
plt.ylabel('CDF')
plt.show()

from_cdf_to_x = interp1d(cdf, x)
cdf_new = np.random.uniform(0,1,100)
x_sampled = from_cdf_to_x(cdf_new) 

The plot looks fine, but I get the following error in the 2nd part which I don't understand: A value in x_new is above the interpolation range. What does it mean, in my case, and what would be the proper way to correct it? No matter how much I search, I don't see what is not correct in the program I made...

CodePudding user response:

The range of your array cdf_new is from 0 to 1.0, but the Cauchy cumulative distribution asymptotically approaches 1 without reaching it, thus the interpolation doesn't cover the full range of the uniform distribution. You need to reduce the top range of cdf_new and/or increase the range of x when you generate the Cauchy cumulative distribution

import numpy as np
import matplotlib.pyplot as plt
from scipy.interpolate import interp1d

def f(x):
    return 1 / (np.pi*(1 x**2)) # Cauchy

x = np.linspace(-100, 100,1000)
dx = x[1]-x[0]
cdf = np.cumsum(f(x))*dx

plt.plot(x,cdf)
plt.xlabel('X')
plt.ylabel('CDF')
plt.show()

from_cdf_to_x = interp1d(cdf, x)
cdf_new = np.random.uniform(0.05,0.95,10000)
x_sampled = from_cdf_to_x(cdf_new)
plt.hist(x_sampled, bins=20)
plt.show()

Here is the normal distribution that you were expecting:

enter image description here

  • Related