Vectorize "is in"

I am trying to build a sample of m vectors (with integer entries) together with m evaluations. A vector x of shape (n,1) is evaluated to y=1 if one of its entries is the number 2; otherwise, it is evaluated to y=0.
In order to deal with many such vectors and evaluations, the sample vectors are stored in an (n,m)-shaped ndarray and the evaluations are stored in a (1,m)-shaped ndarray. See the code:

import numpy as np

n = 10 # number of entries in each sample vector
m = 1000 # number of samples

X = np.random.randint(-10, 10, (n, m))
Y = []
for i in range(m):
    if 2 in X[:, i]:
        Y.append(1)
    else:
        Y.append(0)
Y = np.array(Y).reshape((1,-1))
assert (Y.shape == (1,m))

How can I vectorize the computation of Y? I tried to replace the initialization/computation of X and Y by the following:

X = np.random.randint(-10,10,(n,m))
Y = np.apply_along_axis(func1d=lambda x: 1 if 2 in x else 0, axis=0, arr=X)

A few runs suggested that this is usually even a bit slower than my first approach. (Actually, this answer starts by saying that numpy.apply_along_axis is not meant for speed. I am also not sure how well a lambda performs in this context.)

Is there a way to vectorize the computation of Y, i.e. a way to assign a value 1 or 0 to each column, depending on whether that column contains the element 2?

CodePudding user response:

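When you combine NumPy arrays with comparison expressions, NumPy applies the operation element-wise in compiled code, so you do not have to vectorise the task manually. The following code produces the same result, as booleans rather than 0/1 (append .astype(int) if you need integers):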

# mark True wherever an element of X equals 2,
# then, for each column (axis=0), check whether any element is True
Y = (X == 2).any(axis=0).reshape(1, -1)
print(Y.shape)  # (1, m)
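To make the two steps concrete, here is a minimal sketch on a small hand-written array. The values are made up for illustration and are not from the question; the name `mask` is also just an illustrative choice.

```python
import numpy as np

# Illustrative data: 3 entries per vector (rows), 4 sample vectors (columns).
X = np.array([[1, 2, 0, 5],
              [3, 4, 2, 6],
              [7, 8, 9, 2]])

mask = (X == 2)                  # boolean array with the same shape as X
has_two = mask.any(axis=0)       # one boolean per column
Y = has_two.astype(int).reshape(1, -1)  # 0/1 integers in a (1, m) row

print(mask)
print(Y)  # [[0 1 1 1]]
```

The `.astype(int)` step is only needed if the evaluations must be the integers 0 and 1 rather than booleans.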

Using timeit to assess execution times:

loop method: 3240 microseconds per run

numpy method: 6.57 microseconds per run
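The exact benchmark code is not shown above, so the following is only a sketch of how such a comparison could be set up with the standard `timeit` module. The function names `loop_method` and `numpy_method` and the number of runs are assumptions, and the measured times will differ from machine to machine.

```python
import timeit
import numpy as np

n, m = 10, 1000
X = np.random.randint(-10, 10, (n, m))

def loop_method(X):
    # Python-level loop over the columns, as in the question.
    Y = []
    for i in range(X.shape[1]):
        Y.append(1 if 2 in X[:, i] else 0)
    return np.array(Y).reshape((1, -1))

def numpy_method(X):
    # Element-wise comparison followed by a per-column reduction.
    return (X == 2).any(axis=0).reshape(1, -1)

# Both methods should agree before their timings are compared.
assert np.array_equal(loop_method(X), numpy_method(X))

runs = 100  # arbitrary choice for this sketch
t_loop = timeit.timeit(lambda: loop_method(X), number=runs) / runs
t_numpy = timeit.timeit(lambda: numpy_method(X), number=runs) / runs
print(f"loop method:  {t_loop * 1e6:.1f} microseconds per run")
print(f"numpy method: {t_numpy * 1e6:.2f} microseconds per run")
```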

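If you're interested, you could try other approaches, but np.vectorize is unlikely to improve the time: according to the NumPy documentation it is provided for convenience rather than performance, and its implementation is essentially a for loop. The comparison and reduction used above already run in NumPy's compiled code, which may additionally use SIMD instructions at the CPU level, depending on the build and the hardware.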

The bottom line: when using NumPy, always try to find a solution based on boolean arrays and NumPy functions/methods, as these are already heavily optimised in the compiled binaries. Any Python-level code used to manipulate, access, or iterate over the data slows execution dramatically.
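As an illustration of the same idea, the membership test could be extended to several values with `np.isin`. This goes slightly beyond the question, which only asks about the single value 2; the list `targets` and the sample data are hypothetical.

```python
import numpy as np

# Illustrative data: 3 entries per vector (rows), 4 sample vectors (columns).
X = np.array([[1, 2, 0, 5],
              [3, 4, 7, 6],
              [7, 8, 9, 0]])
targets = [2, 5]  # hypothetical set of values to look for

# np.isin tests every element of X for membership in targets;
# any(axis=0) then reduces each column to a single boolean.
Y = np.isin(X, targets).any(axis=0).astype(int).reshape(1, -1)
print(Y)  # [[0 1 0 1]]
```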

By the way, the usual way to speed up a for loop that builds a list of outputs, as you have done, is to use a list comprehension:

Y = np.array([2 in X[:, i] for i in range(m)]).reshape((1, -1))

which executes in 3070 microseconds per run, only marginally faster than the explicit loop.