Python accelerate singular value decomposition

Time:12-04

I want to compute the singular value decomposition of each slice of a 3D matrix.

I used NumPy and SciPy to compute the SVD, but both are significantly slower than the MATLAB implementation: the NumPy and SciPy versions take around 7 s, while the MATLAB version takes only 0.7 s.

Is there a way to accelerate the SVD computation in Python?

Python

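import time
import numpy as np
import numpy.linalg
import scipy.linalg

# 1000 complex 100x100 matrices stacked along the last axis
A = np.random.rand(100, 100, 1000) + 1j * np.random.rand(100, 100, 1000)
# One row of singular values per slice
S = np.empty((A.shape[2], min(A.shape[0:2])))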

t1 = time.time()
for i in range(A.shape[2]):
    S[i, :] = numpy.linalg.svd(A[:, :, i], compute_uv=False)
print("[numpy] Elapsed time: {:.3f} s".format(time.time() - t1))

t1 = time.time()
for i in range(A.shape[2]):
    S[i, :] = scipy.linalg.svdvals(A[:, :, i])
print("[scipy] Elapsed time: {:.3f} s".format(time.time() - t1))

# [numpy] Elapsed time: 7.137 s
# [scipy] Elapsed time: 7.435 s

MATLAB

A = randn(100, 100, 1000) + 1j * randn(100, 100, 1000);
S = nan(size(A,3), min(size(A, [1 2])));
tic;
for i = 1:size(A, 3)
    S(i, :) = svd(A(:,:,i));
end
toc;
% Elapsed time is 0.702556 seconds.

This is the output of np.show_config():

blas_mkl_info:
  NOT AVAILABLE
blis_info:
  NOT AVAILABLE
openblas_info:
    library_dirs = ['D:\\a\\1\\s\\numpy\\build\\openblas_info']
    libraries = ['openblas_info']
    language = f77
    define_macros = [('HAVE_CBLAS', None)]
blas_opt_info:
    library_dirs = ['D:\\a\\1\\s\\numpy\\build\\openblas_info']
    libraries = ['openblas_info']
    language = f77
    define_macros = [('HAVE_CBLAS', None)]
lapack_mkl_info:
  NOT AVAILABLE
openblas_lapack_info:
    library_dirs = ['D:\\a\\1\\s\\numpy\\build\\openblas_lapack_info']
    libraries = ['openblas_lapack_info']
    language = f77
    define_macros = [('HAVE_CBLAS', None)]
lapack_opt_info:
    library_dirs = ['D:\\a\\1\\s\\numpy\\build\\openblas_lapack_info']
    libraries = ['openblas_lapack_info']
    language = f77
    define_macros = [('HAVE_CBLAS', None)]
Supported SIMD extensions in this NumPy install:
    baseline = SSE,SSE2,SSE3
    found = SSSE3,SSE41,POPCNT,SSE42,AVX,F16C,FMA3,AVX2,AVX512F,AVX512CD,AVX512_SKX
    not found = AVX512_CLX,AVX512_CNL
None

CodePudding user response:

I'm not certain this fixes your issue, but you don't need the loop: np.linalg.svd() already handles n-dimensional arrays, treating the last two axes as the matrices and broadcasting over the leading ones. (Even if it didn't, you rarely need explicit loops in NumPy.)

This is how I'd do what you're doing:

import time
import numpy as np
import numpy.linalg as la

shape = (100, 100, 1000)
rng = np.random.default_rng(42)
A = rng.random(shape) + 1j * rng.random(shape)

t1 = time.perf_counter()
S = la.svd(A.T, compute_uv=False)
t2 = time.perf_counter()

print(f"[numpy] Elapsed time: {t2 - t1:.3f} s")

On my computer this takes 0.83 s (using the Intel MKL). I didn't try your MATLAB code though, so I'm not sure this will speed things up for you.
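
To check that the batched call returns the same values as the original loop, a small comparison can help. This is only a minimal sketch on deliberately tiny matrices so that it runs instantly; the helper names svdvals_loop and svdvals_batched are made up for illustration. It uses np.moveaxis instead of A.T so that the slices are not transposed, although the singular values are the same either way.

```python
import numpy as np


def svdvals_loop(A):
    """Singular values of each slice A[:, :, i], one slice at a time."""
    n_slices = A.shape[2]
    S = np.empty((n_slices, min(A.shape[:2])))
    for i in range(n_slices):
        S[i, :] = np.linalg.svd(A[:, :, i], compute_uv=False)
    return S


def svdvals_batched(A):
    """Same result in one call: move the slice axis to the front so that
    the last two axes are the matrices np.linalg.svd operates on."""
    return np.linalg.svd(np.moveaxis(A, -1, 0), compute_uv=False)


rng = np.random.default_rng(42)
shape = (8, 6, 5)  # small on purpose; the question uses (100, 100, 1000)
A = rng.random(shape) + 1j * rng.random(shape)

S_loop = svdvals_loop(A)
S_batched = svdvals_batched(A)
S_transposed = np.linalg.svd(A.T, compute_uv=False)  # the A.T variant from above

print(S_batched.shape)
print(np.allclose(S_loop, S_batched), np.allclose(S_loop, S_transposed))
```

Both comparisons should agree up to floating-point tolerance, since a matrix and its transpose share the same singular values.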

CodePudding user response:

On machines where the Intel Math Kernel Library (MKL) is available, the computation time can be reduced significantly by installing a NumPy/SciPy build that uses MKL. Thanks to @joni and @kwinkunks for that information.

In my case, the computation time dropped from 7 seconds with OpenBLAS to 0.68 seconds with Intel MKL.

This can be done by building NumPy or SciPy from source, following this tutorial: Build NumPy/SciPy from Source

Alternatively, installing the Anaconda distribution, which ships with prebuilt MKL-backed NumPy and SciPy packages, is simpler.
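
To confirm that switching the backend actually helps on a given machine, the same measurement can be repeated before and after the change. The following is a rough sketch rather than a rigorous benchmark; time_batched_svd is a hypothetical helper, and the call at the bottom uses much smaller sizes than the question so that it finishes quickly.

```python
import time

import numpy as np


def time_batched_svd(n=100, n_slices=1000, repeats=3, seed=0):
    """Best-of-`repeats` wall time (seconds) for computing the singular
    values of `n_slices` complex n-by-n matrices in one batched call."""
    rng = np.random.default_rng(seed)
    shape = (n_slices, n, n)
    A = rng.random(shape) + 1j * rng.random(shape)
    best = float("inf")
    for _ in range(repeats):
        t0 = time.perf_counter()
        np.linalg.svd(A, compute_uv=False)
        best = min(best, time.perf_counter() - t0)
    return best


# Small problem size for a quick check
elapsed = time_batched_svd(n=20, n_slices=50)
print(f"[numpy] best of 3: {elapsed:.4f} s")
```

Calling time_batched_svd() with its defaults matches the problem size from the question. Running it in each environment, together with np.show_config() to see which BLAS/LAPACK library is linked, should show whether the MKL build makes a difference.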
