Find a NumPy array with don't-cares in a NumPy array

I would like to find a NumPy array that contains don't-care values, such as:

b = np.array(
    [
        [0,0,-1,-1]
    ]
    , dtype=np.int8
)

where -1 marks a don't-care position, and find it in arrays like:

a = np.array(
    [
        [1,2,0,0],
        [0,1,2,0],
        [0,0,1,2],
        [2,0,0,1],
        [3,4,0,0],
        [0,3,4,0],
        [0,0,3,4],
        [4,0,0,3]
    ]
    , dtype=np.int8
)

and return the row indices 2 and 6 for the sample above.

The a array normally has around 1000 rows, i.e. shape (~1000, 4).

Note that the b array can have don't-cares anywhere, or none at all. Examples:

b = np.array(
    [
        [0,3,4,-1]
    ]
    , dtype=np.int8
)

# --OR--

b = np.array(
    [
        [2,-1,0,1]
    ]
    , dtype=np.int8
)

# --OR--

b = np.array(
    [
        [2,-1,-1,-1]
    ]
    , dtype=np.int8
)
# etc...

CodePudding user response：

You could replace the values in the main array with -1 wherever the sub-array has -1. Then you only need to find the rows where a equals b in every column:

np.where(np.all(np.where((b==-1),-1,a)==b, axis=1))

Output

(array([2, 6], dtype=int64),)
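
To make the one-liner easier to follow, here is a minimal sketch that splits it into its three steps, using the sample a and b from the question. The intermediate names masked, row_matches and rows are only illustrative and do not appear in the original answer.

```python
import numpy as np

a = np.array(
    [
        [1, 2, 0, 0],
        [0, 1, 2, 0],
        [0, 0, 1, 2],
        [2, 0, 0, 1],
        [3, 4, 0, 0],
        [0, 3, 4, 0],
        [0, 0, 3, 4],
        [4, 0, 0, 3],
    ],
    dtype=np.int8,
)
b = np.array([[0, 0, -1, -1]], dtype=np.int8)

# Step 1: overwrite the don't-care columns of a with -1,
# so those columns trivially equal b (b broadcasts over the rows of a).
masked = np.where(b == -1, -1, a)

# Step 2: a row matches when every column of the masked array equals b.
row_matches = np.all(masked == b, axis=1)

# Step 3: convert the boolean mask into row indices.
rows = np.where(row_matches)[0]
print(rows)
```

Note that np.where called with a single argument returns a tuple with one array per dimension, which is why the output above is wrapped in a tuple and why the sketch takes element [0].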

CodePudding user response：

I had a much more brute-force answer. The where/all approach is better.

import numpy as np

a = np.array(
    [
        [1,2,0,0],
        [0,1,2,0],
        [0,0,1,2],
        [2,0,0,1],
        [3,4,0,0],
        [0,3,4,0],
        [0,0,3,4],
        [4,0,0,3]
    ]
    , dtype=np.int8
)

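def fuzzyfind(haystack, needle):
    # Start with every row as a candidate match
    c = np.ones(haystack.shape[0], dtype=bool)
    for i, v in enumerate(needle):
        # Negative values are don't-cares, so they never exclude a row
        if v >= 0:
            c = c & (haystack[:, i] == v)
    return np.argwhere(c)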

print( fuzzyfind( a, [0, 0, -1, -1] ))
print( fuzzyfind( a, [0, 3, 4, -1] ))
print( fuzzyfind( a, [2, -1, 0, 1] ))
print( fuzzyfind( a, [2, -1, -1, -1] ))

Output:

[[2]
 [6]]
[[5]]
[[3]]
[[3]]

CodePudding user response：

Here's my solution

Extract the sub-array of a without the don't-care columns.
Use np.where to compare the rows of that sub-array against the matching entries of b.
Use np.all with axis=1 to find the rows that match in every remaining column.
You can then use match_indices to index back into a.

# If we have the following:
b = np.array([0,0,-1,-1])

# We can extract the columns that are filtered on from b:
wildcard = -1
filter_cols = [i for i, val in enumerate(b) if val != wildcard]

b_sub = b[filter_cols]
a_sub = a[:, filter_cols]

# Now we can filter on a_sub to get the indices that match
matches = np.where(a_sub == b_sub, True, False)
match_indices = np.all(matches, axis=1)

match_indices is a boolean mask over the rows of a, True where a row matches. Use a[match_indices] to get the matching rows.
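
Since the question asks for row indices rather than a mask, here is a minimal sketch of the same column-subset idea that ends with np.flatnonzero. It assumes b is 1-D, as in this answer; the names keep, mask and rows are illustrative, and a boolean column selector stands in for the filter_cols list.

```python
import numpy as np

a = np.array(
    [
        [1, 2, 0, 0],
        [0, 1, 2, 0],
        [0, 0, 1, 2],
        [2, 0, 0, 1],
        [3, 4, 0, 0],
        [0, 3, 4, 0],
        [0, 0, 3, 4],
        [4, 0, 0, 3],
    ],
    dtype=np.int8,
)
b = np.array([0, 3, 4, -1])
wildcard = -1

# Boolean selector for the columns that actually constrain the match.
keep = b != wildcard

# Compare only the constrained columns; a row matches if all of them agree.
mask = np.all(a[:, keep] == b[keep], axis=1)

# Turn the boolean mask into integer row indices.
rows = np.flatnonzero(mask)
print(rows)      # indices of the matching rows
print(a[mask])   # the matching rows themselves
```

If b contains only wildcards, keep selects no columns and np.all over the empty axis is True for every row, so every row is reported as a match.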

CodePudding user response：

You can try the following:

np.nonzero(np.all((a == b) | (b == -1), axis=1))
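
As a rough sketch, the expression above can be wrapped in a small helper and run against the four example patterns from the question. The function name find_rows and its wildcard parameter are hypothetical, not part of the original answer.

```python
import numpy as np

def find_rows(a, b, wildcard=-1):
    """Return indices of rows in a that equal b, ignoring wildcard positions in b."""
    b = np.asarray(b)
    # A position passes if the values agree or b holds the wildcard there;
    # b broadcasts across the rows of a whether it is 1-D or shaped (1, n).
    ok = (a == b) | (b == wildcard)
    return np.nonzero(np.all(ok, axis=1))[0]

a = np.array(
    [
        [1, 2, 0, 0],
        [0, 1, 2, 0],
        [0, 0, 1, 2],
        [2, 0, 0, 1],
        [3, 4, 0, 0],
        [0, 3, 4, 0],
        [0, 0, 3, 4],
        [4, 0, 0, 3],
    ],
    dtype=np.int8,
)

for pattern in ([0, 0, -1, -1], [0, 3, 4, -1], [2, -1, 0, 1], [2, -1, -1, -1]):
    print(pattern, find_rows(a, pattern))
```

This version needs no Python-level loop over columns and does not modify or copy a beyond the temporary boolean arrays, which should be fine for arrays of around 1000 rows.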