How to speed up applying scipy integrate of two 2D arrays?


I'm applying an integration function using scipy.integrate to two 2D arrays. Here's the example:

from scipy import integrate
import numpy as np

def integrate_lno2(top, bottom, peak_height, peak_width):
    return integrate.quad(lambda x: np.exp(-np.power(x - peak_height, 2)/(2*np.power(peak_width, 2))), top, bottom)[0]

# change row and col to test speed
row = 100; col = 100; peak_height=300; peak_width=60

top = np.linspace(100, 200, row*col).reshape(row, col)
bottom = np.linspace(800, 900, row*col).reshape(row, col)

res = np.zeros((row, col))

for i in range(row):
    for j in range(col):
        res[i, j] = integrate_lno2(top[i, j], bottom[i, j], peak_height, peak_width)

If the shape of the 2D arrays increases, the for loop becomes slow. I have found the Numba integrand example, but it doesn't accept the upper and lower limits.

CodePudding user response:

As in this previous answer, you can use Numba to speed up the lambda calls, which are very slow due to large NumPy overheads (NumPy is not optimized to operate on scalars and is very slow at doing so). Even better: you can tell Numba to generate a C function that SciPy can call directly with very little overhead, since this almost completely removes the overhead of the slow CPython interpreter. You can also pre-compute the constant divisor and turn the division into a multiplication (which is faster).

Here is the resulting code:

import numba as nb
import numpy as np
import scipy as sp
import scipy.integrate  # ensure sp.integrate is available

# change row and col to test speed
row = 100; col = 100; peak_height = 300; peak_width = 60

# pre-compute the constant so the integrand multiplies instead of dividing
factor = -1.0 / (2 * np.power(peak_width, 2))

# compile the integrand to a C function with the double(double) signature quad expects
@nb.cfunc('float64(float64)')
def compute_numba(x):
    return np.exp(np.power(x - peak_height, 2) * factor)

# wrap the compiled function so quad can call it with almost no Python overhead
compute_c = sp.LowLevelCallable(compute_numba.ctypes)

def integrate_lno2(top, bottom):
    return sp.integrate.quad(compute_c, top, bottom)[0]

top = np.linspace(100, 200, row*col).reshape(row, col)
bottom = np.linspace(800, 900, row*col).reshape(row, col)

res = np.zeros((row, col))

for i in range(row):
    for j in range(col):
        res[i, j] = integrate_lno2(top[i, j], bottom[i, j])

The computing loop is roughly 100 times faster on my machine.
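For reference, here is a minimal timing sketch to reproduce the measurement on your own machine (the exact speed-up will vary with hardware and SciPy/Numba versions; it assumes the variables from the snippet above are already defined):

import time

# time the loop that calls quad through the LowLevelCallable;
# compare against the same loop using the original pure-Python integrand
start = time.perf_counter()
for i in range(row):
    for j in range(col):
        res[i, j] = integrate_lno2(top[i, j], bottom[i, j])
elapsed = time.perf_counter() - start
print(f"LowLevelCallable loop: {elapsed:.3f} s")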
