A way to speed up the smoothing function-CodePudding

I have a function that is used for smoothing a curve by taking a moving average over a factor of 2. However, in it's current form the function is slow because of the loops. I have added numba to increase the speed but still it's slow. Any suggestions on how I could improve the speed?

from numba import prange, jit
@jit(nopython=True, parallel=True)      
def smoothing_function(x,y, window=2, pad = 1):
    len_x    = len(x)
    max_x    = np.max(x)
    xoutmid  = np.full(len_x, np.nan)
    xoutmean = np.full(len_x, np.nan)
    yout     = np.full(len_x, np.nan)
    
    for i in prange(len_x):
        x0 = x[i]
        xf = window*x[i]
        
        if xf < max_x:
            e = np.where(x  == x.flat[np.abs(x - xf).argmin()])[0][0]
            if e<len(x):
                yout[i]     = np.nanmean(y[i:e])
                xoutmid[i]  = x[i]   np.log10(0.5) * (x[i] - x[e])
                xoutmean[i] = np.nanmean(x[i:e])
    return xoutmid, xoutmean, yout 
# Working example

f = lambda x: x**(-1.7)*2*np.random.rand(len(x))
x = np.logspace(np.log10(1e-5), np.log10(1), 1000)
xvals, yvals  = x, f(x)


%timeit res =smoothing_function(xvals, yvals, window=2, pad = 1)

# plot results
plt.loglog(xvals, yvals)
plt.loglog(res[1], res[2])

CodePudding user response：

Issue is that you are computing end index (e) very inefficiently. If you use the fact that x is in logspace, this can be done much faster since you know the distance between 2 consecutive points and just need to compute index which is log(window) far from initial point. Working example is as follows:

@jit(nopython=True, parallel=True)
def smoothing_function2(x,y, window=2, pad = 1):
    len_x    = len(x)
    max_x    = np.max(x)
    xoutmid  = np.full(len_x, np.nan)
    xoutmean = np.full(len_x, np.nan)
    yout     = np.full(len_x, np.nan)

    f_idx = int(len(x)*np.log10(window)/(np.log10(x[-1])-np.log10(x[0])))

    for i in prange(len_x):
        if window*x[i] < max_x:
            e = min(i f_idx, len_x-1)
            yout[i]     = np.nanmean(y[i:e])
            xoutmid[i]  = x[i]   np.log10(0.5) * (x[i] - x[e])
            xoutmean[i] = np.nanmean(x[i:e])
    return xoutmid, xoutmean, yout

f = lambda x: x**(-1.7)*2*np.random.rand(len(x))
x = np.logspace(np.log10(1e-5), np.log10(1), 1000)
xvals, yvals  = x, f(x)

res1 = smoothing_function(xvals, yvals, window=2, pad = 1)
res2 = smoothing_function2(xvals, yvals, window=2, pad = 1)

print([np.nansum((r1 - r2)**2) for r1, r2 in zip(res1, res2)])  # verify that all the outputs are same

%timeit res = smoothing_function(xvals, yvals, window=2, pad = 1)

%timeit res = smoothing_function2(xvals, yvals, window=2, pad = 1)

Output is the following:

[0.0, 0.0, 0.0]
337 µs ± 59.6 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
49.1 µs ± 2.6 µs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

Which verifies that both functions return same output but smoothing_function2 is ~6.8x faster. If x is not restricted to be in logspace, you can still use the properties of whatever space you are using to get a similar improvement. There could be more ways to improve this further, it depends on what your target is. You could also try implementing this in C or Cython.

CodePudding user response：

@Abhinav's solution works perfectly. Another very slightly faster solution is this one:

from numba import prange, jit
@jit(nopython=True, parallel=True)      
def smoothing_function(x,y, window=2, pad = 1):
    def bisection(array,value):
        '''Given an ``array`` , and given a ``value`` , returns an index j such that ``value`` is between array[j]
        and array[j 1]. ``array`` must be monotonic increasing. j=-1 or j=len(array) is returned
        to indicate that ``value`` is out of range below and above respectively.'''
        n = len(array)
        if (value < array[0]):
            return -1
        elif (value > array[n-1]):
            return n
        jl = 0# Initialize lower
        ju = n-1# and upper limits.
        while (ju-jl > 1):# If we are not yet done,
            jm=(ju jl) >> 1# compute a midpoint with a bitshift
            if (value >= array[jm]):
                jl=jm# and replace either the lower limit
            else:
                ju=jm# or the upper limit, as appropriate.
            # Repeat until the test condition is satisfied.
        if (value == array[0]):# edge cases at bottom
            return 0
        elif (value == array[n-1]):# and top
            return n-1
        else:
            return jl

    len_x    = len(x)
    max_x    = np.max(x)
    xoutmid  = np.full(len_x, np.nan)
    xoutmean = np.full(len_x, np.nan)
    yout     = np.full(len_x, np.nan)
    
    for i in prange(len_x):
        x0 = x[i]
        xf = window*x0
        
        if xf < max_x:
            #e = np.where(x  == x[np.abs(x - xf).argmin()])[0][0]
            e = bisection(x,xf)
            if e<len_x:
                yout[i]     = np.nanmean(y[i:e])
                xoutmid[i]  = x0   np.log10(0.5) * (x0 - x[e])
                xoutmean[i] = np.nanmean(x[i:e])

    return xoutmid, xoutmean, yout