Given a 3 x 3 matrix S
(mechanical stress tensor) of the form
        | sxx sxy szx |
    S = | sxy syy syz |
        | szx syz szz |
I can obtain the 3 eigenvalues (principal stresses) from
import numpy as np

# Example values (used in the comment below):
sxx, syy, szz = 100.0, -50.0, 0.0
sxy, szx, syz = 75.0, 0.0, 0.0

S = np.array([[sxx, sxy, szx],
              [sxy, syy, syz],
              [szx, syz, szz]])
e_val, e_vec = np.linalg.eig(S)
principal_stress = np.sort(e_val)  # 3 principal components sorted from min to max
# with sxx=100, syy=-50, sxy=75, szz=szx=syz=0 the result is
# (-81.07, 0.0, 131.07)
Now, I have a large 2d array in which each column contains the 6 elements (sxx, syy, szz, sxy, szx, syz) for a given "point of interest" (poi).
There can be hundreds of thousands of columns (if not millions) in this array...
           | sxx_1 sxx_2 ... sxx_n |
           | syy_1 syy_2 ... syy_n |
S_by_poi = | szz_1 szz_2 ... szz_n |
           | sxy_1 sxy_2 ... sxy_n |
           | szx_1 szx_2 ... szx_n |
           | syz_1 syz_2 ... syz_n |
What would be the fastest way to compute the 3 eigenvalues for every column, sort them, and put them into an array P_by_poi, like this:
           | pmin_1 pmin_2 ... pmin_n |
P_by_poi = | pmid_1 pmid_2 ... pmid_n |
           | pmax_1 pmax_2 ... pmax_n |
Can I somehow avoid looping over the columns of S_by_poi in Python? That does not appear to be a good or performant solution...
CodePudding user response:
np.linalg.eig and np.sort are slow for small matrices like these. While there might be a vectorized way to do this, I do not expect it to be fast, because the implementation of np.linalg.eig is not efficient for this specific case anyway.
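For reference, the fully vectorized NumPy route does exist: np.linalg.eigvalsh accepts a stacked (n, 3, 3) array of symmetric matrices and returns each one's eigenvalues already sorted in ascending order, so no explicit Python loop and no separate sort are needed. A minimal sketch (the function name principal_stresses_numpy is mine, not from the question), assuming the (6, n) S_by_poi layout described above; as noted, it may still trail a specialized analytic implementation:

```python
import numpy as np

def principal_stresses_numpy(S_by_poi):
    """Vectorized principal stresses for a (6, n) array of stress components."""
    # Rows of S_by_poi are: sxx, syy, szz, sxy, szx, syz
    sxx, syy, szz, sxy, szx, syz = S_by_poi
    n = S_by_poi.shape[1]
    # Build an (n, 3, 3) stack of symmetric matrices
    S = np.empty((n, 3, 3), dtype=np.float64)
    S[:, 0, 0] = sxx
    S[:, 1, 1] = syy
    S[:, 2, 2] = szz
    S[:, 0, 1] = S[:, 1, 0] = sxy
    S[:, 0, 2] = S[:, 2, 0] = szx
    S[:, 1, 2] = S[:, 2, 1] = syz
    # eigvalsh is specialized for symmetric (Hermitian) matrices and
    # returns eigenvalues sorted in ascending order for each matrix
    return np.linalg.eigvalsh(S).T  # shape (3, n): rows pmin, pmid, pmax
```

With the example values from the question (sxx=100, syy=-50, sxy=75, the rest 0) in a single column, this returns approximately (-81.07, 0.0, 131.07) for that column.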
One efficient solution is to implement this in Numba or Cython, based on the analytic formula for the eigenvalues of a 3x3 symmetric matrix. The sort can be done efficiently using a sorting network. The analytic formula can be obtained from mathematical tools like Wolfram Alpha, and the resulting expression can then be factorized manually, though compilers are able to apply common sub-expression elimination. Note that the intermediate variables are complex numbers, and I am not sure the eigenvalues are always real in your case; I assumed they are, because you want to sort them, which requires the values to be real (indeed, a real symmetric matrix always has real eigenvalues). Here is the resulting implementation:
import numba as nb
import numpy as np

@nb.njit
def sorted_eigenvalues(v):
    assert v.shape == (3, 3)
    a, b, c, d, e, f = v[0,0], v[1,1], v[2,2], v[0,1], v[0,2], v[1,2]
    assert v[1,0] == d and v[2,0] == e and v[2,1] == f
    # Analytic eigenvalue solution of the 3x3 symmetric input matrix
    tmp1 = -a**2 + a*b + a*c - b**2 + b*c - c**2 - 3*d**2 - 3*e**2 - 3*f**2
    tmp2 = 2*a**3 - 3*a**2*b - 3*a**2*c - 3*a*b**2 + 12*a*b*c - 3*a*c**2 + 9*a*d**2 + 9*a*e**2 - 18*a*f**2 + 2*b**3 - 3*b**2*c - 3*b*c**2 + 9*b*d**2 - 18*b*e**2 + 9*b*f**2 + 2*c**3 - 18*c*d**2 + 9*c*e**2 + 9*c*f**2 + 54*d*e*f
    tmp3 = np.sqrt((4*tmp1**3 + tmp2**2) + 0j)
    tmp4 = (tmp2 + tmp3) ** (1/3)
    tmp5 = 1/3 * (a + b + c)
    tmp6 = 1 + 1j*np.sqrt(3)
    tmp7 = 1 - 1j*np.sqrt(3)
    eigv1 = tmp4/(3*2**(1/3)) - (2**(1/3)*tmp1)/(3*tmp4) + tmp5
    eigv2 = (tmp6*tmp1)/(3*2**(2/3)*tmp4) - (tmp7*tmp4)/(6*2**(1/3)) + tmp5
    eigv3 = (tmp7*tmp1)/(3*2**(2/3)*tmp4) - (tmp6*tmp4)/(6*2**(1/3)) + tmp5
    # Assume the values are real and remove the FP rounding errors
    eigv1 = np.real(eigv1)
    eigv2 = np.real(eigv2)
    eigv3 = np.real(eigv3)
    # Sort the eigenvalues using a fast sorting network
    eigv1, eigv2 = min(eigv1, eigv2), max(eigv1, eigv2)
    eigv2, eigv3 = min(eigv2, eigv3), max(eigv2, eigv3)
    eigv1, eigv2 = min(eigv1, eigv2), max(eigv1, eigv2)
    return eigv1, eigv2, eigv3
This function is about 50 times faster than np.linalg.eig for computing one 3x3 symmetric matrix on my machine (the computation itself takes only about 0.5 µs with Numba, while the overhead of calling a Numba function is about 0.2-0.3 µs).
The second step is to loop over the columns of the S_by_poi array in a Numba function and call the above function for each column. For better performance, I advise you to pass the parameters a, b, c, d, e, f directly to sorted_eigenvalues, so as to avoid the expensive creation of NumPy arrays (and even some memory loads/stores). I expect this to be about 100 times faster than the NumPy code in a loop (since the per-call Numba overhead is removed). You can even parallelize the computation using prange and parallel=True in Numba, resulting in an even faster computation (several hundred times faster, if not several thousand times faster, on high-end computing servers)!
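Putting the two steps together, here is a sketch of such a batch driver (the name sorted_eigenvalues_by_poi is mine, not from the answer above). It inlines the analytic formula on scalars, as advised. So that the sketch also runs with plain NumPy, the Numba decorator is commented out; with Numba installed, uncomment it and replace range with nb.prange. Note one caveat: tmp4 is zero for matrices proportional to the identity, so a robust version would special-case that degenerate input.

```python
import numpy as np
# import numba as nb   # uncomment together with the decorator when Numba is available

# @nb.njit(parallel=True)
def sorted_eigenvalues_by_poi(S_by_poi):
    """Sorted principal stresses for each column of a (6, n) array."""
    n = S_by_poi.shape[1]
    P_by_poi = np.empty((3, n))
    for i in range(n):  # use nb.prange(n) with parallel=True
        a, b, c = S_by_poi[0, i], S_by_poi[1, i], S_by_poi[2, i]  # sxx, syy, szz
        d, e, f = S_by_poi[3, i], S_by_poi[4, i], S_by_poi[5, i]  # sxy, szx, syz
        # Analytic eigenvalues of the symmetric 3x3 matrix (same formula as above)
        tmp1 = -a**2 + a*b + a*c - b**2 + b*c - c**2 - 3*d**2 - 3*e**2 - 3*f**2
        tmp2 = (2*a**3 - 3*a**2*b - 3*a**2*c - 3*a*b**2 + 12*a*b*c - 3*a*c**2
                + 9*a*d**2 + 9*a*e**2 - 18*a*f**2 + 2*b**3 - 3*b**2*c - 3*b*c**2
                + 9*b*d**2 - 18*b*e**2 + 9*b*f**2 + 2*c**3 - 18*c*d**2
                + 9*c*e**2 + 9*c*f**2 + 54*d*e*f)
        tmp3 = np.sqrt((4*tmp1**3 + tmp2**2) + 0j)
        tmp4 = (tmp2 + tmp3) ** (1/3)  # zero only for multiples of the identity
        tmp5 = 1/3 * (a + b + c)
        tmp6 = 1 + 1j*np.sqrt(3)
        tmp7 = 1 - 1j*np.sqrt(3)
        eigv1 = np.real(tmp4/(3*2**(1/3)) - (2**(1/3)*tmp1)/(3*tmp4) + tmp5)
        eigv2 = np.real((tmp6*tmp1)/(3*2**(2/3)*tmp4) - (tmp7*tmp4)/(6*2**(1/3)) + tmp5)
        eigv3 = np.real((tmp7*tmp1)/(3*2**(2/3)*tmp4) - (tmp6*tmp4)/(6*2**(1/3)) + tmp5)
        # Sorting network: min/mid/max in 3 compare-exchanges
        eigv1, eigv2 = min(eigv1, eigv2), max(eigv1, eigv2)
        eigv2, eigv3 = min(eigv2, eigv3), max(eigv2, eigv3)
        eigv1, eigv2 = min(eigv1, eigv2), max(eigv1, eigv2)
        P_by_poi[0, i] = eigv1
        P_by_poi[1, i] = eigv2
        P_by_poi[2, i] = eigv3
    return P_by_poi
```

The result has the requested (3, n) layout with rows pmin, pmid, pmax, and agrees with np.linalg.eigvalsh applied per column.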