Python : equivalent of Matlab ismember on rows for large arrays

Time:03-09

I can't find an efficient way to reproduce Matlab's "ismember(a,b,'rows')" in Python, where a and b are arrays of shape (ma,2) and (mb,2) respectively, and ma and mb are the numbers of pairs in each array.

The ismember module (https://pypi.org/project/ismember/) crashes because at some point, namely when evaluating np.all(a[:, None] == b, axis=2).any(axis=1), it needs to create an array of shape (ma,mb,2), which is too big. Moreover, even when the function works (because the arrays are small enough), it is about 100 times slower than in Matlab. I guess this is because Matlab uses a built-in mex function. Why doesn't Python have what I would consider such an important function? I use it countless times in my calculations...
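To make the size problem concrete, here is a rough sketch of how much memory the broadcast comparison needs. The helper name broadcast_bytes and the array sizes are hypothetical, chosen only for illustration; the one assumption is that NumPy stores one byte per boolean element.

```python
import numpy as np

def broadcast_bytes(ma, mb, ncols=2):
    """Bytes needed by the boolean array produced by a[:, None] == b."""
    # The comparison broadcasts to shape (ma, mb, ncols);
    # each boolean element occupies one byte.
    return ma * mb * ncols * np.dtype(np.bool_).itemsize

# Hypothetical sizes: 100,000 pairs in each array.
print(broadcast_bytes(100_000, 100_000) / 1e9, "GB")  # 20.0 GB
```

The cost grows with the product ma*mb, so doubling both arrays quadruples the memory.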

PS: the solution proposed here, "Python version of ismember with 'rows' and index", does not correspond to Matlab's true ismember function, since it does not work pair by pair: it does not verify that a pair of values of 'a' exists in 'b', but only that the values of each column of 'a' exist in the corresponding column of 'b'.
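A small example of the difference. The column-wise check below is only an assumed stand-in for that kind of approach, not the code of the linked answer, and the arrays are made up for illustration.

```python
import numpy as np

a = np.array([[1, 4]])
b = np.array([[1, 2], [3, 4]])

# Column-wise check: is each value of a present somewhere
# in the same column of b?
colwise = np.isin(a[:, 0], b[:, 0]) & np.isin(a[:, 1], b[:, 1])

# Row-wise check: does the pair (1, 4) occur as a full row of b?
rowwise = (a[:, None] == b).all(axis=2).any(axis=1)

print(colwise.tolist())  # [True]  (1 is in column 0, 4 is in column 1)
print(rowwise.tolist())  # [False] (no row of b equals [1, 4])
```

Only the row-wise result matches what ismember(a,b,'rows') is expected to return.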

CodePudding user response:

You can use np.unique(array,axis=0) to find the identical rows of an array. With return_inverse=True it also gives every row an integer label, so you can reduce your 2D problem to a 1D problem, which is easily solved with np.isin():

import numpy as np

# Dummy example array:
a = np.array([[1,2],[3,4]])
b = np.array([[3,5],[2,3],[3,4]])

# ismember_row function, which rows of a are in b:
def ismember_row(a,b):
    # Get the unique row index
    _, rev = np.unique(np.concatenate((b,a)),axis=0,return_inverse=True)
    # Split the index
    a_rev = rev[len(b):]
    b_rev = rev[:len(b)]
    # Return the result:
    return np.isin(a_rev,b_rev)

res = ismember_row(a,b)
# res = array([False,  True])
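Matlab's ismember can also return the position of each match. Below is a minimal sketch of how the same np.unique labelling could be extended to do that. The function name ismember_rows is hypothetical, the indices are 0-based (Matlab's are 1-based), and -1 is used here as an assumed marker for "not found". It assumes a and b have the same number of columns.

```python
import numpy as np

def ismember_rows(a, b):
    """Return (tf, loc): tf[i] is True if a[i] is a row of b,
    loc[i] is the lowest index of that row in b, or -1 if absent."""
    a = np.asarray(a)
    b = np.asarray(b)
    if len(b) == 0:
        return np.zeros(len(a), dtype=bool), np.full(len(a), -1)
    # Give every distinct row of (b, a) one integer label.
    _, inv = np.unique(np.concatenate((b, a)), axis=0, return_inverse=True)
    inv = inv.ravel()  # defensive: make sure the labels are 1-D
    b_lab, a_lab = inv[:len(b)], inv[len(b):]
    # Sorted labels present in b, with the index of their first occurrence.
    uniq_b, first_idx = np.unique(b_lab, return_index=True)
    # Binary search each label of a among the labels of b.
    pos = np.searchsorted(uniq_b, a_lab)
    pos = np.clip(pos, 0, len(uniq_b) - 1)  # keep indices in range
    tf = uniq_b[pos] == a_lab
    loc = np.where(tf, first_idx[pos], -1)
    return tf, loc

a = np.array([[1, 2], [3, 4]])
b = np.array([[3, 5], [2, 3], [3, 4]])
tf, loc = ismember_rows(a, b)
print(tf.tolist())   # [False, True]
print(loc.tolist())  # [-1, 2]
```

Like the function above, this only sorts and searches 1D label arrays, so it never builds the (ma,mb,2) intermediate.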