Is there some efficient way to "double vectorize" a Numpy function?
Consider some function f
which is vectorized over its first 3 positional arguments; its implementation consists entirely of Numpy vectorized functions (arithmetic, trigonometry, et alia) which correctly implement broadcasting.
The first two arguments of f
are x
and y
, which represent some kind of input data. Its 3rd argument q
is a parameter that controls some aspect of the computation.
In my program, I have the following:
- Arrays
x
andy
that are 1-d arrays of the same length.x[i]
andy[i]
correspond to thei
th data point in a dataset. - Array
q
which is a 1-d array of different length.q[k]
corresponds to somek
th data point in a different collection.
I want to compute the value of f(x[i], y[i], q[k])
for any pair i, k
, collecting the results in a matrix.
That is, I want to perform a vectorized version of the following calculation:
result = np.empty((len(x), len(q))
for k in range(len(q)):
for i in range(len(x)):
result[i, k] = f(x[i], y[i], q[k])
The "singly-vectorized" version (over the i
index) would be:
result = np.empty((len(x), len(q))
for k in range(len(q)):
result[:, k] = f(x, y, q[k])
And this is what I currently use in my code.
Is there an efficient way to vectorize over both indexes, maybe using some broadcasting trick?
CodePudding user response:
Depending on the actual f
, I think I'd approach it like...
# set up example variables
N, M = 11, 13
x = np.random.normal(N)
y = np.random.normal(N)
q = np.random.normal(M)
# reshape for broadcasting
X = x[:, np.newaxis]
Y = y[:, np.newaxis]
Q = q[np.newaxis, :]
f(X, Y, Q)
then if f
is, I lack the proper term for it, ufunc
-like, it should broadcast nicely. If it isn't, and it's hard to tell without the actual "a bit complicated" implementation", you could make it so either via changing the implementation or possibly by numba
's vectorize
decorator, which is how I usually do that.
CodePudding user response:
You can vectorise the second "vectorised" multiplication operation using np.outer. It contains no for loops.
The following code contains a print statement with the "double-vectorised" version. I also fixed the minor typo in brackets in the creation of the empty results matrix.
import numpy as np
x = [0.1, 0.2, 0.3]
y = [0.4, 0.6, 0.8]
q = [0.4, 0.5, 0.6, 0.7]
result = np.empty((len(x),len(q)))
for k in range(len(q)):
for i in range(len(x)):
result[i, k] = np.arctan2(x[i], y[i]) * q[k]
print(result)
print(np.outer(np.arctan2(x, y) , q))
Results:
[[0.09799147 0.12248933 0.1469872 0.17148506]
[0.12870022 0.16087528 0.19305033 0.22522539]
[0.14350827 0.17938534 0.2152624 0.25113947]]
[[0.09799147 0.12248933 0.1469872 0.17148506]
[0.12870022 0.16087528 0.19305033 0.22522539]
[0.14350827 0.17938534 0.2152624 0.25113947]]
Hope this helps.