I am trying to do something like the following in NumPy:
import numpy as np
def f(x):
    return x[0] + x[1]
X1 = np.array([0, 1, 2])
X2 = np.array([0, 1, 2])
X = np.meshgrid(X1, X2)
result = np.vectorize(f)(X)
with the expected result being array([[0, 1, 2], [1, 2, 3], [2, 3, 4]]), but it returns the following error:
      2
      3 def f(x):
----> 4     return x[0] + x[1]
      5
      6 X1 = np.array([0, 1, 2])
IndexError: invalid index to scalar variable
This is because it tries to apply f to all 18 scalar elements of the mesh grid, whereas I want it applied to 9 pairs of 2 scalars. What is the correct way to do this?
Note: I am aware this code will work if I do not vectorize f, but vectorizing matters because f can be any function, e.g. it could contain an if statement, which raises a ValueError when f is called on whole arrays without vectorizing.
CodePudding user response:
If you insist on using numpy.vectorize, you need to pass a signature when you vectorize the function.
import numpy as np
def f(x):
    return x[0] + x[1]
    # Or
    # return np.add.reduce(x, axis=0)
X1 = np.array([0, 1, 2])
X2 = np.array([0, 1, 2])
X = np.meshgrid(X1, X2)
# np.asarray(X).shape -> (2, 3, 3)
# shape of the desired result is (3, 3)
f_vec = np.vectorize(f, signature='(n,m,m)->(m,m)')
result = f_vec(X)
print(result)
Output:
[[0 1 2]
 [1 2 3]
 [2 3 4]]
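Note that with the signature above, f receives the whole (2, 3, 3) array in a single call. If the goal is to call f once per pair of scalars, one possible variation is sketched below. It is an assumption on my part, not part of the code above: the grids are stacked along the last axis and the signature '(n)->()' is used. The name pairs and the if/else inside f are introduced only for illustration.

```python
import numpy as np

def f(x):
    # x is a 1-D array of length 2 here, so a plain `if` on x[0] is safe
    return x[0] + x[1] if x[0] > 0 else 0

X1 = np.array([0, 1, 2])
X2 = np.array([0, 1, 2])

# Stack the two (3, 3) grids into a single (3, 3, 2) array of pairs
pairs = np.stack(np.meshgrid(X1, X2), axis=-1)

# '(n)->()' tells np.vectorize that f maps a length-n vector to a scalar,
# so it loops over the leading (3, 3) dimensions
f_vec = np.vectorize(f, signature='(n)->()')
result = f_vec(pairs)
print(result.shape)
```

This still runs a Python-level loop over the pairs, so it is a convenience rather than a speed-up.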
CodePudding user response:
For the function you mentioned in the comments:
f = lambda x: x[0] + x[1] if x[0] > 0 else 0
You can use np.where:
def f(x):
    # np.where(some_condition, value_if_true, value_if_false)
    return np.where(x[0] > 0, x[0] + x[1], 0)
NumPy was designed with vectorization in mind. Unless you have some crazy edge case, there's almost always a way to take advantage of NumPy's broadcasting and vectorization. I strongly recommend seeking out vectorized solutions before giving up and resorting to for loops.
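As a quick check, here is a minimal sketch of the np.where approach applied to the mesh grid from the question. The condition is taken on x[0], matching the lambda quoted above; the variable names are only for illustration.

```python
import numpy as np

def f(x):
    # Both branches are computed for every element; np.where then picks
    # the value from the first where the condition holds, else 0
    return np.where(x[0] > 0, x[0] + x[1], 0)

X1 = np.array([0, 1, 2])
X2 = np.array([0, 1, 2])
X = np.meshgrid(X1, X2)  # two (3, 3) grids

result = f(X)
print(result)
```

Because both branches are evaluated everywhere, this pattern suits cheap, side-effect-free expressions. It does not prevent errors or warnings raised inside the branch that ends up discarded.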
CodePudding user response:
If you are too lazy, or ignorant, to do "proper" vectorization, you can use np.vectorize. But you need to take time to really read its docs. It isn't magic. It can be useful, especially if you need to take advantage of broadcasting and the function, for some reason or other, only accepts scalars.
Rewriting your function to work with scalar inputs (though it also works fine with arrays, in this case):
In [91]: def foo(x,y): return x + y
    ...: f = np.vectorize(foo)
With scalar inputs:
In [92]: f(1,2)
Out[92]: array(3)
With 2 arrays (a (2,1) and (3,)), returning a (2,3):
In [93]: f(np.array([1,2])[:,None], np.arange(1,4))
Out[93]:
array([[2, 3, 4],
       [3, 4, 5]])
Same thing with meshgrid:
In [94]: I,J = np.meshgrid(np.array([1,2]), np.arange(1,4),indexing='ij')
In [95]: I
Out[95]:
array([[1, 1, 1],
       [2, 2, 2]])
In [96]: J
Out[96]:
array([[1, 2, 3],
       [1, 2, 3]])
In [97]: f(I,J)
Out[97]:
array([[2, 3, 4],
       [3, 4, 5]])
Or sparse meshgrid arrays, with the same shapes as the inputs in [93]:
In [98]: I,J = np.meshgrid(np.array([1,2]), np.arange(1,4),indexing='ij', sparse=True)
In [99]: I,J
Out[99]:
(array([[1],
        [2]]),
 array([[1, 2, 3]]))
But in a true vectorized sense, you can just add the 2 arrays:
In [100]: I + J
Out[100]:
array([[2, 3, 4],
       [3, 4, 5]])
The first paragraph of the np.vectorize docs (my emphasis):
Define a vectorized function which takes a nested sequence of objects or numpy arrays as inputs and returns a single numpy array or a tuple of numpy arrays. The vectorized function evaluates pyfunc over successive tuples of the input arrays like the python map function, except it uses the broadcasting rules of numpy.
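To tie this back to the question, here is a minimal sketch of a scalar-only function containing an if statement, vectorized and applied to the mesh grid. The if/else body is the example from the question's comments, and passing otypes is my own choice rather than something the question requires. Without otypes, np.vectorize infers the output type from the first call, which can be surprising when the branches return different types.

```python
import numpy as np

def foo(x, y):
    # Scalar-only logic: this `if` would raise a ValueError on whole arrays
    return x + y if x > 0 else 0

# otypes fixes the output dtype instead of inferring it from the first call
f = np.vectorize(foo, otypes=[int])

X1 = np.array([0, 1, 2])
X2 = np.array([0, 1, 2])

# Unpack the two grids so foo receives one scalar from each per call
result = f(*np.meshgrid(X1, X2))

# Broadcasting gives the same result without building the grids
same = f(X1, X2[:, None])
print(result)
```

The key difference from the question's attempt is that the grids are passed as two separate arguments rather than as one list, so np.vectorize pairs them up element by element.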