Is it a problem of stability of the matrix calculation in python?


I recently ran into an accuracy problem with matrix multiplication in NumPy. See my example below (also here: https://trinket.io/python3/6a4c22e450).

import numpy as np

para = np.array([[ 3.28522453e+08, -1.36339334e+08,  1.36339334e+08],
                 [-1.36339334e+08,  5.65818682e+07, -5.65818682e+07],
                 [ 1.36339334e+08, -5.65818672e+07,  5.65818682e+07]])

in1 = np.array([[ 285.91695469],
    [ 262.3       ],
    [-426.64380594]])

in2 = np.array([[ 285.91695537],
    [ 262.3       ],
    [-426.64380443]])

(in1 - in2)/in1
>>> array([[-2.37831286e-09],
    [ 0.00000000e+00],
    [ 3.53925214e-09]])

The relative difference between in1 and in2 is very small, on the order of 10^-9.

res1 = para @ in1
>>> array([[-356.2361908 ],
    [ 443.16068268],
    [-180.86068344]])

res2 = para @ in2
>>> array([[ 73.03147125],
   [265.01131439],
   [ -2.71131516]])

But after the matrix multiplication, why does the difference between the outputs res1 and res2 change so much?

(res1 - res2)/res1
>>> array([[1.20500857],
   [0.40199723],
   [0.98500882]])

CodePudding user response:

This is not a bug; it is to be expected with a matrix such as yours.

Your matrix (which is symmetric) has one large and two small eigenvalues:

In [34]: evals, evecs = np.linalg.eigh(para)

In [35]: evals
Out[35]: array([-1.06130078e-01,  1.00000000e+00,  4.41686189e+08])

Because the matrix is symmetric, it can be diagonalized with an orthonormal basis. That just means that we can define a new coordinate system in which the matrix is diagonal, and the diagonal values are those eigenvalues. The effect of multiplying the matrix by a vector in these coordinates is to simply multiply each coordinate by the corresponding eigenvalue, i.e. the first coordinate is multiplied by -0.106, the second coordinate doesn't change, and the third coordinate is multiplied by the large factor 4.4e8.
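
As a quick numerical check (a minimal sketch reusing the evals and evecs computed above), you can rebuild para from its eigendecomposition; because evecs is orthonormal, evecs @ diag(evals) @ evecs.T should reproduce para up to rounding:

# evecs is orthonormal, so its transpose is its inverse and
# para is (approximately) evecs @ np.diag(evals) @ evecs.T
reconstructed = evecs @ np.diag(evals) @ evecs.T
print(np.allclose(reconstructed, para))   # expected: True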

The reason you get such a drastic change when multiplying the original matrix para by in1 and in2 is that, in the new coordinates, the third component of the transformed in1 is positive, and the third component of the transformed in2 is negative. (That is, the points are on opposite sides of the 2-d eigenspace associated with the two smaller eigenvalues.) There are several ways to find these transformed coordinates; one is to compute inv(V)@x, where V is the matrix of eigenvectors:

In [36]: np.linalg.solve(evecs, in1)
Out[36]: 
array([[ 5.64863071e+02],
       [-1.16208620e+02],
       [ 8.55527517e-07]])

In [37]: np.linalg.solve(evecs, in2)
Out[37]: 
array([[ 5.64863070e+02],
       [-1.16208619e+02],
       [-2.71381169e-07]])

Note the different signs of the third components. The values are small, but when you multiply by the diagonal matrix, they are multiplied by 4.4e8, giving 377.87 and -119.86, respectively. That large change carries over to the results you observed in the original coordinates.
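
To make that concrete, here is a minimal sketch (reusing evals, evecs, in1 and in2 from above) that scales the eigen-coordinates by the eigenvalues and maps them back, which reproduces para @ in1 and para @ in2:

# Coordinates in the eigenbasis; evecs is orthonormal, so evecs.T @ x
# gives the same result as np.linalg.solve(evecs, x).
c1 = evecs.T @ in1
c2 = evecs.T @ in2

# Scale each coordinate by its eigenvalue, then map back to the
# original coordinates; the third coordinate is amplified by ~4.4e8.
print(evecs @ (evals[:, None] * c1))   # approximately res1
print(evecs @ (evals[:, None] * c2))   # approximately res2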


For a rougher estimate: the entries of para are ~10^8, so quantities of that magnitude are multiplied when you compute para @ x. Given that the relative differences between in1 and in2 are ~10^-9 (and that in1 and res1 are of comparable magnitude, a few hundred), it is not surprising that the relative differences between res1 and res2 come out around ~10^-9 * ~10^8, i.e. ~0.1. (Your computed relative errors were [1.2, 0.4, 0.99], so the rough estimate is in the right ballpark.)
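
The same back-of-the-envelope estimate can be done in code (a rough sketch, not a rigorous error bound, using the arrays defined above):

# Largest relative difference in the inputs times the magnitude of the
# matrix entries, compared with the observed relative differences in
# the outputs; both should be of roughly the same order.
rel_in = np.max(np.abs((in1 - in2) / in1))
scale = np.max(np.abs(para))
print(rel_in * scale)
print(np.max(np.abs((res1 - res2) / res1)))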

CodePudding user response:

This looks like a bug ... NumPy is written in C, so this could be an issue of numbers being cast to a smaller float type, which would cause a large floating-point error in a case like this.
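
One way to check that hypothesis is to look at the dtypes of the operands and the result (a minimal sketch); any implicit down-casting to a smaller float type would show up there:

# Inspect the floating-point types used in the multiplication.
print(para.dtype, in1.dtype, (para @ in1).dtype)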
