Suppose we have the following 2d arrays:
>>> A
array([[1, 1],
       [2, 2],
       [3, 1]])
>>> B
array([[2, 1],
       [1, 2],
       [3, 1],
       [4, 2]])
I want to test the membership of the rows of A in the rows of B. For a single row of A we can test its membership in B with:
np.any(np.all(A[index] == B, axis=1))
I want to do this for all rows of A at once without looping over the indices. The result should be:
desired_result = array([False, False, True])
How do we retrieve this result in a broadcasted way (without looping over rows of A)?
CodePudding user response:
As you correctly suspected, you can use broadcasting to compare each row of A to every row of B in a vectorized fashion:
out = (A == B[:, None]).all(axis=-1).any(axis=0)
>>> out
array([False, False, True])
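For anyone who wants to run this end to end, here is a minimal self-contained sketch using the arrays from the question. The intermediate names (eq, rows_match, expected) are only illustrative and not part of the original one-liner; the loop at the end reuses the per-row test from the question as a cross-check.

```python
import numpy as np

A = np.array([[1, 1], [2, 2], [3, 1]])
B = np.array([[2, 1], [1, 2], [3, 1], [4, 2]])

# B[:, None] has shape (4, 1, 2); comparing it with A (3, 2) broadcasts to (4, 3, 2)
eq = A == B[:, None]

# Collapse the column axis: rows_match[j, i] is True when B[j] equals A[i]
rows_match = eq.all(axis=-1)

# Collapse the B axis: one boolean per row of A
out = rows_match.any(axis=0)

# Loop-based reference built from the single-row test in the question
expected = np.array([np.any(np.all(A[i] == B, axis=1)) for i in range(len(A))])

print(eq.shape, rows_match.shape, out.shape)
print(out)
```

Note that the intermediate boolean array has len(B) * len(A) * ncols elements, so for very large inputs the loop may use less memory.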
Explanation
To better understand how this works, let's use a modified problem:
import numpy as np

A = np.array([
    [4, 2],
    [1, 1],
    [2, 2],
    [3, 1]])
B = np.array([
    [2, 1],
    [4, 2],
    [1, 2],
    [3, 1],
    [4, 2]])
where we expect to find A[0] ([4, 2]) at rows 1 and 4 in B. Then:
>>> (A == B[:, None]).all(axis=-1)
array([[False, False, False, False],
       [ True, False, False, False],
       [False, False, False, False],
       [False, False, False,  True],
       [ True, False, False, False]])
Each row of this matrix corresponds to a row of B and each column to a row of A. It shows that A[0] == B[1] and also A[0] == B[4] (first column), and that A[3] == B[3] (last column).
At this point, .any(axis=0) finishes the job: it reduces over the rows of B, leaving one boolean per row of A, which is the required result.
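To round off the modified example, the following sketch applies the final reduction and also recovers which rows of B matched A[0]. The names match, result and b_rows_for_a0 are hypothetical and chosen just for this illustration.

```python
import numpy as np

A = np.array([[4, 2], [1, 1], [2, 2], [3, 1]])
B = np.array([[2, 1], [4, 2], [1, 2], [3, 1], [4, 2]])

# match[j, i] is True when B[j] equals A[i]; shape (5, 4)
match = (A == B[:, None]).all(axis=-1)

# Reduce over the rows of B: is each row of A present anywhere in B?
result = match.any(axis=0)

# Indices of the rows of B that equal A[0] (the True entries of the first column)
b_rows_for_a0 = np.nonzero(match[:, 0])[0]

print(result.tolist())         # [True, False, False, True]
print(b_rows_for_a0.tolist())  # [1, 4]
```

Only A[0] and A[3] occur in B, so the first and last entries of the result are True.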