Home > Software engineering >  scipy.stats.norm for array of values with different accuracy in different method
scipy.stats.norm for array of values with different accuracy in different method

Time:04-15

Generate two arrays:

np.random.seed(1)
x = np.random.rand(30, 2)
np.random.seed(2)
x_test = np.random.rand(5,2)

Caluclate scipy.stats.norm axis by axis:

gx0 = scipy.stats.norm(np.mean(x[:,0]), np.std(x[:,0])).pdf(x_test[:,0])
gx1 = scipy.stats.norm(np.mean(x[:,1]), np.std(x[:,1])).pdf(x_test[:,1])

and get:

gx0 = array([1.29928091, 1.1344507 , 1.30920536, 1.10709298, 1.26903949])
gx1 = array([0.29941644, 1.36808598, 1.13817727, 1.34149231, 0.95054596])

Calculate using NumPy broadcasting

gx = scipy.stats.norm(np.mean(x, axis = 0), np.std(x, axis = 0)).pdf(x_test)

and get:

gx = array([[1.29928091, 0.29941644],
       [1.1344507 , 1.36808598],
       [1.30920536, 1.13817727],
       [1.10709298, 1.34149231],
       [1.26903949, 0.95054596]])

gx[:,0] and gx0 look like the same, but subtracting one from another gx[:,0] - gx0 will get:

array([-4.44089210e-16, -2.22044605e-16, -4.44089210e-16,  0.00000000e 00,
        0.00000000e 00])

Why is that?

CodePudding user response:

Not sure why they calculate the answer to different precisions, but converting the input arrays to 128 bit floats solves the problem:

np.random.seed(1)
x = np.random.rand(30, 2).astype(np.float128)
np.random.seed(2)
x_test = np.random.rand(5,2).astype(np.float128)

...

print(gx[:,0] - gx0)

results in:

[0. 0. 0. 0. 0.]
  • Related