I would like to search a NumPy array for a row pattern that contains don't cares, like:
b = np.array(
[
[0,0,-1,-1]
]
, dtype=np.int8
)
where -1 is the don't care, and find it in arrays like:
a = np.array(
[
[1,2,0,0],
[0,1,2,0],
[0,0,1,2],
[2,0,0,1],
[3,4,0,0],
[0,3,4,0],
[0,0,3,4],
[4,0,0,3]
]
, dtype=np.int8
)
and return the row indices 2 and 6 for the above sample.
The a array normally has around 1000 rows, shape (~1000, 4).
Note that the b array can have don't cares anywhere, or none at all. Examples:
b = np.array(
[
[0,3,4,-1]
]
, dtype=np.int8
)
# --OR--
b = np.array(
[
[2,-1,0,1]
]
, dtype=np.int8
)
# --OR--
b = np.array(
[
[2,-1,-1,-1]
]
, dtype=np.int8
)
# etc...
CodePudding user response:
You could replace the values in the main array with -1 wherever your sub-array has -1. Then you can just find the rows where b == a:
np.where(np.all(np.where((b==-1),-1,a)==b, axis=1))
Output
(array([2, 6], dtype=int64),)
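To make the one-liner above easier to follow, here is a minimal sketch that splits it into its three steps, using the a and b from the question. The intermediate names (masked, row_matches, rows) are only illustrative, and the sketch assumes -1 is never a real value you want to search for.

```python
import numpy as np

a = np.array(
    [
        [1, 2, 0, 0],
        [0, 1, 2, 0],
        [0, 0, 1, 2],
        [2, 0, 0, 1],
        [3, 4, 0, 0],
        [0, 3, 4, 0],
        [0, 0, 3, 4],
        [4, 0, 0, 3],
    ],
    dtype=np.int8,
)
b = np.array([[0, 0, -1, -1]], dtype=np.int8)

# Step 1: overwrite the don't-care columns of a with -1,
# so those columns trivially equal b.
masked = np.where(b == -1, -1, a)

# Step 2: a row matches only if every column equals b.
row_matches = np.all(masked == b, axis=1)

# Step 3: turn the boolean mask into row indices.
rows = np.where(row_matches)[0]
print(rows)  # [2 6]
```

Note that np.where with a single argument returns a tuple of arrays, which is why the original output is wrapped in a tuple; taking element [0] gives the plain index array.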
CodePudding user response:
I had a much more brute force answer. The where/all thing is better.
import numpy as np
a = np.array(
[
[1,2,0,0],
[0,1,2,0],
[0,0,1,2],
[2,0,0,1],
[3,4,0,0],
[0,3,4,0],
[0,0,3,4],
[4,0,0,3]
]
, dtype=np.int8
)
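def fuzzyfind( haystack, needle ):
    # Start with every row as a candidate match.
    c = np.ones( haystack.shape[0], dtype=bool )
    for i,v in enumerate(needle):
        # Negative entries are don't cares; skip them.
        if v >= 0:
            c = c & (haystack[:,i] == v)
    return np.argwhere(c)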
print( fuzzyfind( a, [0, 0, -1, -1] ))
print( fuzzyfind( a, [0, 3, 4, -1] ))
print( fuzzyfind( a, [2, -1, 0, 1] ))
print( fuzzyfind( a, [2, -1, -1, -1] ))
Output:
[[2]
[6]]
[[5]]
[[3]]
[[3]]
CodePudding user response:
Here's my solution:
- Extract the sub-array of a without the don't-care columns
- Use np.where to match the rows of a against b
- Use np.all to keep only the rows that match in every column (axis=1)
- You could use match_indices to re-index into a.
# If we have the following:
b = np.array([0,0,-1,-1])
# We can extract the columns that are filtered on from b:
wildcard = -1
filter_cols = [i for i, val in enumerate(b) if val != wildcard]
b_sub = b[filter_cols]
a_sub = a[:, filter_cols]
# Now we can filter on a_sub to get the indices that match
matches = np.where(a_sub == b_sub, True, False)
match_indices = np.all(matches, axis=1)
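match_indices is a boolean mask over the rows of a and should have the answer you need! If you want the row numbers themselves, np.flatnonzero(match_indices) converts the mask to indices.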
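The column-filtering steps above could be wrapped into a function roughly as follows. This is a sketch, not the answer's exact code: the function name find_rows_by_columns is made up for illustration, a boolean column mask is swapped in for the list comprehension, and the result is converted to row indices with np.flatnonzero.

```python
import numpy as np

def find_rows_by_columns(a, pattern, wildcard=-1):
    """Return indices of rows of a that equal pattern outside wildcard columns."""
    # Accept both a 1-D pattern and the (1, n) shape used in the question.
    pattern = np.asarray(pattern).ravel()
    # Boolean mask of the columns that actually constrain the match.
    keep = pattern != wildcard
    # Compare only the constrained columns.
    matches = a[:, keep] == pattern[keep]
    # A row matches when all of its constrained columns agree.
    return np.flatnonzero(np.all(matches, axis=1))

a = np.array(
    [
        [1, 2, 0, 0],
        [0, 1, 2, 0],
        [0, 0, 1, 2],
        [2, 0, 0, 1],
        [3, 4, 0, 0],
        [0, 3, 4, 0],
        [0, 0, 3, 4],
        [4, 0, 0, 3],
    ],
    dtype=np.int8,
)

print(find_rows_by_columns(a, [0, 0, -1, -1]))    # [2 6]
print(find_rows_by_columns(a, [[0, 3, 4, -1]]))   # [5]
```

Because only the constrained columns are compared, the work shrinks as the pattern gains more don't cares.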
CodePudding user response:
You can try the following:
np.nonzero(np.all((a == b) | (b == -1), axis=1))
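This works because a == b broadcasts the single pattern row against every row of a, and b == -1 forces the don't-care columns to count as matches. As a rough check, the sketch below wraps the expression in a helper (the name match_rows and its wildcard parameter are made up for illustration) and runs it on the four patterns from the question.

```python
import numpy as np

def match_rows(a, b, wildcard=-1):
    """Indices of rows of a matching pattern b, ignoring wildcard columns."""
    # Force the pattern to shape (1, n) so it broadcasts across rows of a.
    b = np.asarray(b).reshape(1, -1)
    # True where the value matches or the pattern column is a don't care.
    hits = (a == b) | (b == wildcard)
    return np.flatnonzero(hits.all(axis=1))

a = np.array(
    [
        [1, 2, 0, 0],
        [0, 1, 2, 0],
        [0, 0, 1, 2],
        [2, 0, 0, 1],
        [3, 4, 0, 0],
        [0, 3, 4, 0],
        [0, 0, 3, 4],
        [4, 0, 0, 3],
    ],
    dtype=np.int8,
)

patterns = [
    [0, 0, -1, -1],   # matches rows 2 and 6
    [0, 3, 4, -1],    # matches row 5
    [2, -1, 0, 1],    # matches row 3
    [2, -1, -1, -1],  # matches row 3
]
for pattern in patterns:
    print(pattern, match_rows(a, pattern).tolist())
```

np.flatnonzero is used here instead of np.nonzero only so the result is a flat index array rather than a one-element tuple.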