Assume I have a function that is defined for two float values, and the function is complex enough that it is not easy to modify. Now I have two 2D arrays of the same shape, say $X_{n \times m}$ and $Y_{n \times m}$, and I need to apply the function to each pair of corresponding elements $x_{ij}, y_{ij}$. How can I speed this up compared with using two nested for loops?
The following is the general code, in which the function has been simplified to a summation:
import numpy as np
def func(x, y):
    return x + y
X = np.random.rand(100, 100)
Y = np.random.rand(100, 100)
Z = np.zeros((100, 100))
for i in range(100):
    for j in range(100):
        z = func(X[i, j], Y[i, j])
        Z[i, j] = z
CodePudding user response:
Methods
- Using NumPy vectorize
- Using NumPy functions
Method 1--NumPy vectorize
Provides a roughly 4X speedup (3.14 ms vs. 12.2 ms)
%%timeit
Z = np.zeros((100, 100))
for i in range(100):
    for j in range(100):
        z = func(X[i, j], Y[i, j])
        Z[i, j] = z
12.2 ms ± 901 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
Using Vectorize
%%timeit
vec_func = np.vectorize(func) # vectorized version of function
Z2 = vec_func(X, Y) # use vectorized version on X, Y
3.14 ms ± 192 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
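As a hedged illustration of where np.vectorize helps: the sketch below wraps a hypothetical scalar-only function (`scalar_func` is made up for this example, not the asker's function) that uses `math` and a branch, so it cannot accept arrays directly. NumPy's documentation describes np.vectorize as a convenience whose implementation is essentially a for loop, so any gain comes from lower per-element overhead rather than true vectorization. Passing `otypes` is optional; it is used here on the assumption that the output is always a float.

```python
import math
import numpy as np

def scalar_func(x, y):
    # Hypothetical stand-in for a complex function that only accepts scalars.
    if x > y:
        return math.sqrt(x - y)
    return math.exp(x) + y

rng = np.random.default_rng(0)
X = rng.random((100, 100))
Y = rng.random((100, 100))

# Build the wrapper once and reuse it; otypes fixes the output dtype.
vec_func = np.vectorize(scalar_func, otypes=[float])
Z = vec_func(X, Y)

# Reference result from the explicit double loop.
Z_loop = np.empty_like(X)
for i in range(X.shape[0]):
    for j in range(X.shape[1]):
        Z_loop[i, j] = scalar_func(X[i, j], Y[i, j])

print(np.allclose(Z, Z_loop))
```

The wrapped function is left untouched, which matches the constraint in the question that the original function is hard to modify.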
Method 2--Using NumPy vectorized functions
This applies if the more complex function can be expressed with NumPy functions such as add, subtract, sqrt, exp, etc.
- Provides a roughly 790X speedup for the simple example function (15.5 µs vs. 12.2 ms)
Code
def func(X, Y):
    return np.add(X, Y)
%timeit Z3 = func(X, Y)
15.5 µs ± 1.6 µs per loop (mean ± std. dev. of 7 runs, 100000 loops each)
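When the real function contains branches, it can sometimes still be written with array operations. A minimal sketch, assuming a hypothetical branchy function (`scalar_func` and `array_func` are illustrative names, not from the question): np.where picks between two array expressions element by element. Both branches are computed for the whole array, so the argument of the square root is clamped first.

```python
import math
import numpy as np

def scalar_func(x, y):
    # Hypothetical scalar function with a branch.
    if x > y:
        return math.sqrt(x - y)
    return math.exp(x) + y

def array_func(X, Y):
    # np.where evaluates both branches for every element, so clamp the
    # sqrt argument at 0 to avoid invalid-value warnings where x <= y.
    diff = np.maximum(X - Y, 0.0)
    return np.where(X > Y, np.sqrt(diff), np.exp(X) + Y)

rng = np.random.default_rng(0)
X = rng.random((100, 100))
Y = rng.random((100, 100))

Z = array_func(X, Y)

# Spot-check a few elements against the scalar version.
for i, j in [(0, 0), (3, 7), (99, 99)]:
    assert math.isclose(Z[i, j], scalar_func(X[i, j], Y[i, j]))
```

This requires rewriting the function, so it only applies when the function can be modified after all.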
CodePudding user response:
You could use np.vectorize, or use Python lists and floats instead of NumPy arrays:
4.82 ms original
1.78 ms vectorized
2.02 ms mapped
1.04 ms mapped2
Code:
from timeit import repeat
import numpy as np

def original(func, X, Y):
    # Baseline: explicit double loop over NumPy array elements.
    Z = np.zeros((100, 100))
    for i in range(100):
        for j in range(100):
            z = func(X[i, j], Y[i, j])
            Z[i, j] = z
    return Z

def vectorized(func, X, Y):
    return np.vectorize(func)(X, Y)

def mapped(func, X, Y):
    # Convert to nested lists of Python floats, map, then convert back.
    return np.array([
        [*map(func, x, y)]
        for x, y in zip(X.tolist(), Y.tolist())
    ])

def mapped2(func, X, Y):
    # Uses the pre-converted global lists X2, Y2 (ignores X, Y) and
    # returns nested lists, so no conversion cost is included.
    return [
        [*map(func, x, y)]
        for x, y in zip(X2, Y2)
    ]

fs = original, vectorized, mapped, mapped2

def func(x, y):
    return x + y

X = np.random.rand(100, 100)
Y = np.random.rand(100, 100)
X2 = X.tolist()
Y2 = Y.tolist()

# Check that every variant matches the baseline.
expect = fs[0](func, X, Y)
for f in fs:
    print((f(func, X, Y) == expect).all())

# Time each variant: best of the repeats, averaged over 10 calls.
for _ in range(3):
    for f in fs:
        t = min(repeat(lambda: f(func, X, Y), number=10)) / 10
        print('%.2f ms ' % (t * 1e3), f.__name__)
    print()
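One plausible reason the list-based versions come out ahead: indexing a NumPy array element by element creates a NumPy scalar object on every access, while `tolist()` converts everything to built-in Python floats once, up front. The speed explanation is an assumption rather than something measured above; the short sketch below only checks the types involved.

```python
import numpy as np

X = np.random.rand(3, 3)
X2 = X.tolist()

print(type(X[0, 0]))    # <class 'numpy.float64'>
print(type(X2[0][0]))   # <class 'float'>
```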