Numpy - most efficient way to find 2d array in another 2darray

Time:08-29

I'm new to NumPy and trying to understand how to search for a 2D array inside another 2D array. I don't need the indices, just True/False.

For example, I have an array of shape 10x10, mostly ones, that contains a 2x2 block of zeros somewhere:

import numpy as np

ar = np.array([[1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
 [1, 1, 1, 1, 1, 1, 1, 1, 1, 1],
 [1, 1, 0, 1, 1, 0, 1, 1, 0, 1],
 [1, 1, 0, 1, 1, 0, 1, 1, 0, 1],
 [1, 1, 1, 1, 0, 0, 1, 1, 1, 1],
 [1, 1, 1, 1, 0, 0, 1, 1, 1, 0],
 [1, 1, 1, 1, 1, 0, 1, 1, 1, 1],
 [1, 1, 1, 1, 0, 0, 1, 1, 1, 1],
 [1, 1, 0, 1, 1, 0, 1, 1, 1, 2],
 [1, 1, 0, 1, 1, 0, 1, 1, 1, 1]]
)

and I have another array that I want to find in it:

ar2 = np.zeros((2,2))

I tried functions like isin and where, but they match individual elements anywhere in the array, not the whole 2x2 block.

Here's what I came up with: iterate over rows and columns, slice a 2x2 array, and compare it with the zeros array:

for r, c in np.ndindex(ar.shape):
    if r-1>=0 and c-1>=0 and np.array_equal(ar[r - 1:r + 1, c - 1:c + 1], ar2):
        print(f'found it {r}:{c}')

I'm not sure if this is the best solution, but at least it works. Maybe there is some easier and faster way to search for 2x2 zeroes?

CodePudding user response:

I think using the scikit-image library is one of the best ways to do this:

from skimage.util import view_as_windows

view_ = view_as_windows(ar, (2, 2))
res_temp = np.all((view_ == ar2[None, ...]), (-2, -1))
result = np.nonzero(res_temp)

# (array([4], dtype=int64), array([4], dtype=int64))

This returns the indices of the top-left corner of each match. To get the same result as your code, which reports the bottom-right corner of the 2x2 block, add one to each index.
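If scikit-image is not installed, NumPy 1.20 and later provides `np.lib.stride_tricks.sliding_window_view`, which builds the same kind of windowed view. The following is a minimal sketch rather than part of the answer above; the helper name `contains_block` is made up for illustration, and it returns only True/False, as the question asks.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view  # requires NumPy >= 1.20


def contains_block(ar, ar2):
    """Return True if the 2D array ar2 occurs somewhere inside the 2D array ar."""
    # ar2 cannot fit if it is larger than ar along either axis
    if ar2.shape[0] > ar.shape[0] or ar2.shape[1] > ar.shape[1]:
        return False
    # windows has shape (rows - h + 1, cols - w + 1, h, w); it is a view, not a copy
    windows = sliding_window_view(ar, ar2.shape)
    # A window matches when all of its h * w elements equal ar2
    matches = np.all(windows == ar2, axis=(-2, -1))
    return bool(matches.any())


# Small example: a 4x4 array of ones with one 2x2 block of zeros
demo = np.ones((4, 4), dtype=int)
demo[1:3, 2:4] = 0
print(contains_block(demo, np.zeros((2, 2))))    # True
print(contains_block(demo, np.full((2, 2), 7)))  # False
```

Note that the comparison `windows == ar2` allocates a temporary boolean array with one entry per window element, so memory use grows with the size of ar2.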

CodePudding user response:

Based on this answer by Brenlla, I made this function which works with 2d arrays:

def find_array_in_array_2d(ar, ar2):

    # Find all matches with first element of ar2
    match_idx = np.nonzero(ar[:-ar2.shape[0] + 1, :-ar2.shape[1] + 1] == ar2[0, 0])

    # Check remaining indices of ar2
    for i, j in list(np.ndindex(ar2.shape))[1:]:

        # End if no possible matches left
        if len(match_idx[0]) == 0:
            break

        # Index into ar offset by i, j
        nz2 = (match_idx[0] + i, match_idx[1] + j)

        # Find remaining matches with selected element
        to_keep = np.nonzero(ar[nz2] == ar2[i, j])[0]
        match_idx = match_idx[0][to_keep], match_idx[1][to_keep]

    return match_idx

print(find_array_in_array_2d(ar, ar2))
# (array([4]), array([4]))

I think this will be faster than your method when ar is large and ar2 is small, especially when ar contains few values that also appear in ar2.
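The same element-by-element idea can also be written with shifted slices instead of index arrays. This is a sketch of a variant, not the function above: the name `find_block_shifted` is hypothetical, and it returns a boolean mask of top-left corners rather than index arrays.

```python
import numpy as np


def find_block_shifted(ar, ar2):
    """Return a boolean mask of top-left corners where ar2 matches inside ar."""
    h, w = ar2.shape
    # Number of valid top-left positions along each axis
    out_rows = ar.shape[0] - h + 1
    out_cols = ar.shape[1] - w + 1
    if out_rows <= 0 or out_cols <= 0:
        # ar2 does not fit inside ar, so there are no candidate positions
        return np.zeros((max(out_rows, 0), max(out_cols, 0)), dtype=bool)
    mask = np.ones((out_rows, out_cols), dtype=bool)
    for i, j in np.ndindex(h, w):
        # Compare element (i, j) of ar2 against ar shifted by (i, j)
        mask &= ar[i:i + out_rows, j:j + out_cols] == ar2[i, j]
        if not mask.any():
            break  # no candidate positions left
    return mask


# Small example: a 4x4 array of ones with one 2x2 block of zeros
demo = np.ones((4, 4), dtype=int)
demo[1:3, 2:4] = 0
mask = find_block_shifted(demo, np.zeros((2, 2)))
print(mask.any())         # True
print(np.argwhere(mask))  # [[1 2]]
```

Each pass compares the whole candidate grid at once, so this version does the same amount of work per element of ar2 regardless of how many candidates remain; the index-array version above does less work once most candidates are eliminated.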
