I get this error when trying to map this function over a numpy array:
>>> a = np.array([1, 2, 3, 4, 5])
>>> g = lambda x: 0 if x % 2 == 0 else 1
>>> g(a)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "<stdin>", line 1, in <lambda>
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
I was expecting the result array([1, 0, 1, 0, 1]),
whereas it works fine in this case:
>>> f = lambda x: x ** 2
>>> f(a)
array([ 1, 4, 9, 16, 25])
What can I do to map the function g over the array a faster than a for loop, preferably using some of numpy's faster code?
CodePudding user response:
This has problems:
a = np.array([1, 2, 3, 4, 5])
g = lambda x: 0 if x % 2 == 0 else 1
g(a)
A lambda is essentially just an unnamed function, which you happen to be naming here, so you might as well:
def g(x):
    return 0 if x % 2 == 0 else 1
But that's still a bit odd, since an integer modulo 2 is already 0 or 1, so this would be the same (when applied to integers, which is what you're looking to do):
def g(x):
    return x % 2
At which point you have to wonder if a function is needed at all. And it isn't, this works:
a = np.array([1, 2, 3, 4, 5])
a % 2
However, note where your expectation went wrong: f = lambda x: x ** 2 followed by f(a) works not because the lambda is applied to each element. The operation is applied to the array as a whole, and the array supports spreading the operation over its elements (numpy calls these elementwise operations) for raising to a power, just as it does for the modulo operator, which is why a % 2 works.
Result:
array([1, 0, 1, 0, 1], dtype=int32)
Note that this type of spreading isn't something that generally works - you shouldn't expect Python to just do the spreading when needed for any data type (like a list or set). It's a feature of numpy's implementation of arrays: the operations have been defined on the array and implemented to spread over its elements.
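To make the contrast concrete, here is a minimal sketch (the variable names are only for illustration) of how a plain list and a numpy array respond to the same operators:

```python
import numpy as np

values = [1, 2, 3, 4, 5]

# A plain list does not define %, so this raises TypeError.
try:
    values % 2
    list_supports_mod = True
except TypeError:
    list_supports_mod = False

# For a list, * means repetition, not elementwise multiplication.
repeated = values * 2

# With a list, the per-element work has to be written out explicitly.
list_result = [x % 2 for x in values]

# A numpy array defines % itself and applies it to every element.
array_result = np.array(values) % 2

print(list_supports_mod)
print(repeated)
print(list_result)
print(array_result)
```

The list comprehension and the array expression give the same values; the difference is that the array does the looping internally in compiled code.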
CodePudding user response:
You can execute some mathematical operations, such as exponentiation, on entire numpy arrays, so you're doing the equivalent of np.array([1, 2, 3, 4, 5]) ** 2. The modulus operator works on a numpy array in the same way; what fails is the conditional expression (0 if ... else 1), which needs a single truth value but receives a whole boolean array, hence the error.
The lambda function is being applied to the entire array here, not to each individual element.
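You can use np.vectorize here instead (note that it is provided for convenience rather than speed; internally it is essentially a for loop):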
modulus = np.vectorize(lambda x: 0 if x % 2 == 0 else 1)
modulus(a)
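If speed matters, one alternative worth considering is np.where, which evaluates the condition on the whole array at once. The following is a minimal sketch rather than a drop-in replacement for every conditional; the names result and vectorized are only for illustration:

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5])

# np.where(condition, x, y) takes x where the condition is True
# and y everywhere else, without a Python-level loop.
result = np.where(a % 2 == 0, 0, 1)
print(result)

# The np.vectorize version produces the same values.
vectorized = np.vectorize(lambda x: 0 if x % 2 == 0 else 1)
same = np.array_equal(result, vectorized(a))
print(same)
```

This pattern fits any "value if condition else other value" rule where the condition can be written as an array expression.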
CodePudding user response:
Let's decompose this. The general form of your ternary expression is x if C else y, with C being the comparison. Let's pull out just that comparison to see what we get:
>>> a = np.array([1, 2, 3, 4, 5])
>>> a % 2 == 0
array([False, True, False, True, False])
That mod and comparison give a new array in which each value has been taken modulo 2 and compared with 0. Now you throw in an if (in this case if array([False, True, False, True, False])). But should this if be true when all of the array elements are True, or when just a single one is? What if the array is empty, is that different? That's why numpy has the methods listed in the error message - you have to decide what True means.
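As a small illustration of that choice (a sketch; the names mask, empty and ambiguous are just for this example), here is how the two methods from the error message answer the question differently:

```python
import numpy as np

mask = np.array([1, 2, 3, 4, 5]) % 2 == 0

# any(): True if at least one element is True.
print(mask.any())

# all(): True only if every element is True.
print(mask.all())

# An empty array has no True elements, yet all() holds vacuously.
empty = np.array([], dtype=bool)
print(empty.any())
print(empty.all())

# Asking for the truth value of the whole array is what raises the error.
try:
    bool(mask)
    ambiguous = False
except ValueError:
    ambiguous = True
print(ambiguous)
```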
In this case, though, you don't need to decide anything. You really just wanted each element modulo 2, and you don't have to work so hard to get it.
>>> a = np.array([1, 2, 3, 4, 5])
>>> a % 2
array([1, 0, 1, 0, 1])
There's your answer!