Broadcasted row membership between 2d arrays

Time:12-11

Suppose we have the following 2d arrays:

>>> A
array([[1, 1],
       [2, 2],
       [3, 1]])

>>> B
array([[2, 1],
       [1, 2],
       [3, 1],
       [4, 2]])

I want to test whether each row of A is a member of the rows of B. For a single row of A, we can test its membership in B with:

np.any(np.all(A[index] == B, axis=1))

I want to do this for all rows of A at once without looping over the indices. The result should be:

desired_result = array([False, False, True])

How do we retrieve this result in a broadcasted way (without looping over rows of A)?

CodePudding user response:

As you suspected correctly, you can use broadcasting to compare each row of A to every row of B in a vectorized fashion:

out = (A == B[:, None]).all(axis=-1).any(axis=0)

>>> out
array([False, False,  True])
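
For anyone who wants to run this end to end, here is a minimal self-contained sketch using the arrays from the question. The `expected` variable is only an illustrative name for a loop-based reference built from the single-row test in the question, and it is there to check that the broadcast version agrees:

```python
import numpy as np

A = np.array([[1, 1],
              [2, 2],
              [3, 1]])
B = np.array([[2, 1],
              [1, 2],
              [3, 1],
              [4, 2]])

# A has shape (3, 2); B[:, None] has shape (4, 1, 2).
# Broadcasting compares every row of B with every row of A -> shape (4, 3, 2).
out = (A == B[:, None]).all(axis=-1).any(axis=0)

# Loop-based reference (the single-row test from the question, applied per row).
expected = np.array([np.any(np.all(row == B, axis=1)) for row in A])

print(out)  # [False False  True]
assert np.array_equal(out, expected)
```

Note that the intermediate boolean array holds len(B) * len(A) * A.shape[1] elements, so for very large inputs the memory cost of this approach can become noticeable.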

Explanation

To better understand how this works, let's use a modified problem:

A = np.array([
    [4, 2],
    [1, 1],
    [2, 2],
    [3, 1]])

B = np.array([
    [2, 1],
    [4, 2],
    [1, 2],
    [3, 1],
    [4, 2]])

where we expect to find A[0] ([4, 2]) at rows 1 and 4 in B. Then:

>>> (A == B[:, None]).all(axis=-1)
array([[False, False, False, False],
       [ True, False, False, False],
       [False, False, False, False],
       [False, False, False,  True],
       [ True, False, False, False]])

Each row of this result corresponds to a row of B, and each column to a row of A. It shows that A[0] == B[1] and A[0] == B[4] (first column), and that A[3] == B[3] (last column).

At this point, .any(axis=0) finishes the job: it reduces over the rows of B, leaving one boolean per row of A.
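
To make the shapes at each step explicit, here is a short sketch of the same computation on the modified problem, split into stages. The intermediate names (`elementwise`, `row_match`, `result`) are just illustrative and do not appear in the answer above:

```python
import numpy as np

A = np.array([[4, 2],
              [1, 1],
              [2, 2],
              [3, 1]])
B = np.array([[2, 1],
              [4, 2],
              [1, 2],
              [3, 1],
              [4, 2]])

# Step 1: element-wise comparison of every row of B with every row of A.
elementwise = A == B[:, None]           # shape (5, 4, 2)

# Step 2: a row matches only if all of its elements match.
row_match = elementwise.all(axis=-1)    # shape (5, 4): rows of B x rows of A

# Step 3: a row of A is a member if it matches any row of B.
result = row_match.any(axis=0)          # shape (4,): one flag per row of A

print(elementwise.shape, row_match.shape, result.shape)  # (5, 4, 2) (5, 4) (4,)
print(result)  # [ True False False  True]
```

Here A[0] and A[3] are flagged as members, which agrees with the columns that contain a True in the intermediate table.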
