How can I speed up applying a function element-wise to two 2D arrays?

Assume I have a function that is defined for two float values; the function is rather complex and not easy to modify. I have two 2D arrays of the same shape, say $X_{n \times m}$ and $Y_{n \times m}$, and I need to apply the function to each pair of elements $x_{ij}, y_{ij}$. How can I speed this up compared with two nested for loops?

The following is the general code, in which the function has been simplified to be a summation:

import numpy as np

def func(x, y):
    return x + y

X = np.random.rand(100, 100)
Y = np.random.rand(100, 100)

Z = np.zeros((100, 100))
for i in range(100):
    for j in range(100):
        z = func(X[i, j], Y[i, j])
        Z[i, j] = z

CodePudding user response：

Methods

1. Using np.vectorize
2. Using NumPy functions

Method 1--Numpy Vectorize

Provides roughly a 4x speedup (3.14 ms vs. 12.2 ms)

%%timeit

Z = np.zeros((100, 100))
for i in range(100):
    for j in range(100):
        z = func(X[i, j], Y[i, j])
        Z[i, j] = z

12.2 ms ± 901 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

Using Vectorize

%%timeit

vec_func = np.vectorize(func)  # vectorized version of function

Z2 = vec_func(X, Y)            # use vectorized version on X, Y

3.14 ms ± 192 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
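Note that NumPy documents np.vectorize as a convenience wrapper rather than a performance tool: it still calls the Python function once per element. A minimal sketch, assuming the simplified func from the question, that checks the vectorized result against the explicit loop. It also passes otypes, an optional np.vectorize parameter, so the output dtype is fixed up front, and it builds the wrapper once outside any timed region:

```python
import numpy as np

def func(x, y):
    # Stand-in for the complex scalar function (assumption: a plain sum, as in the question)
    return x + y

rng = np.random.default_rng(0)
X = rng.random((100, 100))
Y = rng.random((100, 100))

# Reference result from the explicit double loop
Z_loop = np.zeros(X.shape)
for i in range(X.shape[0]):
    for j in range(X.shape[1]):
        Z_loop[i, j] = func(X[i, j], Y[i, j])

# Build the wrapper once; otypes fixes the output dtype to float
vec_func = np.vectorize(func, otypes=[float])
Z_vec = vec_func(X, Y)

print(np.array_equal(Z_loop, Z_vec))
```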

Method 2--Using Numpy vectorized functions

This applies when the more complex function can be expressed with NumPy functions such as add, subtract, sqrt, exp, etc., which operate on whole arrays at once.

Provides roughly a 790x speedup for the simple example function (15.5 µs vs. 12.2 ms)

Code

def func(X, Y):
    return np.add(X, Y)

%timeit Z3 = func(X, Y)

15.5 µs ± 1.6 µs per loop (mean ± std. dev. of 7 runs, 100000 loops each)
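As an illustration of rewriting a less trivial function, here is a sketch with a hypothetical scalar function branchy (not from the question) that contains an if statement, together with an array version built from np.where, np.maximum and np.sqrt. Because np.where evaluates both branches for every element, the sketch clamps the sqrt argument so that both branches are safe to compute everywhere:

```python
import numpy as np

def branchy(x, y):
    # Hypothetical scalar function with a branch (illustration only)
    if x > y:
        return np.sqrt(x - y)
    return x * y

def branchy_arrays(X, Y):
    # Both branches are computed on the full arrays, so clamp the
    # sqrt argument at 0 to avoid invalid values where x <= y
    diff = np.maximum(X - Y, 0.0)
    return np.where(X > Y, np.sqrt(diff), X * Y)

rng = np.random.default_rng(0)
X = rng.random((100, 100))
Y = rng.random((100, 100))

Z_fast = branchy_arrays(X, Y)
Z_ref = np.vectorize(branchy)(X, Y)  # element-by-element reference
print(np.allclose(Z_fast, Z_ref))
```

Whether such a rewrite is feasible depends on the real function; if it relies on Python-only logic, Method 1 may be the practical fallback.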

CodePudding user response：

You could use np.vectorize or use Python lists and floats instead of NumPy:

4.82 ms  original
1.78 ms  vectorized
2.02 ms  mapped
1.04 ms  mapped2

Code:

def original(func, X, Y):
    Z = np.zeros((100, 100))
    for i in range(100):
        for j in range(100):
            z = func(X[i, j], Y[i, j])
            Z[i, j] = z
    return Z

def vectorized(func, X, Y):
    return np.vectorize(func)(X, Y)

def mapped(func, X, Y):
    return np.array([
        [*map(func, x, y)]
        for x, y in zip(X.tolist(), Y.tolist())
    ])

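def mapped2(func, X, Y):
    # Ignores X and Y and uses the pre-converted global lists X2, Y2,
    # so the timing excludes tolist() and the conversion back to an array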
    return [
        [*map(func, x, y)]
        for x, y in zip(X2, Y2)
    ]


from timeit import repeat
import numpy as np

fs = original, vectorized, mapped, mapped2

def func(x, y):
    return x + y

X = np.random.rand(100, 100)
Y = np.random.rand(100, 100)
X2 = X.tolist()
Y2 = Y.tolist()

expect = fs[0](func, X, Y)
for f in fs:
    print((f(func, X, Y) == expect).all())

for _ in range(3):
    for f in fs:
        t = min(repeat(lambda: f(func, X, Y), number=10)) / 10
        print('%.2f ms ' % (t * 1e3), f.__name__)
    print()
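One likely reason the list-based versions do well: indexing a NumPy array one element at a time creates a NumPy scalar object on every access, whereas tolist() converts everything to plain Python floats in a single call. A small sketch showing the types involved (the variable names are illustrative):

```python
import numpy as np

X = np.random.rand(3, 3)

elem_numpy = X[0, 0]            # NumPy scalar, created by the index operation
elem_python = X.tolist()[0][0]  # plain Python float from the nested list

print(type(elem_numpy).__name__)
print(type(elem_python).__name__)
print(elem_numpy == elem_python)  # same value, different wrapper type
```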