I have a 2D NumPy array that I use for indexing (by converting it to a tuple of two NumPy arrays, see below). I now want to remove all duplicate index pairs from the original array and move them into a new array. If a pair occurs more than twice, the extra copies can go either into several new arrays, each without duplicates, or into a single new array that may itself contain duplicates:
>>> import numpy as np
>>> L = 2
>>> indices = tuple(np.random.randint(0, L, (2, L**2)))
>>> indices
(array([0, 1, 0, 1]), array([1, 0, 0, 0]))
What I want to get is:
indices = (array([0, 1, 0]), array([1, 0, 0]))
indices_2 = (array([1]), array([0]))
CodePudding user response:
You can use:
import numpy as np
L = 2
indices = tuple(np.random.randint(0, L, (2, L**2)))
# stack the two arrays to handle simultaneously
# one can also use the original array without converting to tuple
a = np.vstack(indices)
# array([[0, 1, 0, 1],
# [1, 0, 0, 0]])
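# get the unique columns (index pairs) and the indices of their first occurrences
# note: np.unique returns the columns in sorted order, not in their original order
unique_cols, idx = np.unique(a, axis=1, return_index=True)
# convert the 2D result into a tuple of arrays so it can be used for indexing
# (starred unpacking such as `(*indices,) = ...` would produce a list, not a tuple)
indices = tuple(unique_cols)
# (array([0, 0, 1]), array([0, 1, 0])), with idx = array([2, 0, 1])
# remove the first occurrences to keep only the duplicates
# again, convert the 2D result into a tuple of arrays
indices_2 = tuple(np.delete(a, idx, axis=1))
# (array([1]), array([0]))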
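The desired output in the question keeps the pairs in their original order, whereas np.unique sorts them. A minimal sketch of an order-preserving variant follows, assuming a boolean mask built from the first-occurrence indices is acceptable. The function name split_first_occurrences and the variable keep are hypothetical names chosen for illustration, not part of the original answer.

```python
import numpy as np

def split_first_occurrences(indices):
    """Split index pairs into first occurrences and the remaining repeats.

    Hypothetical helper: both results keep the original column order.
    """
    # stack the two index arrays into one 2D array, one pair per column
    a = np.vstack(indices)
    # return_index gives the column of the first occurrence of each unique pair
    _, first_idx = np.unique(a, axis=1, return_index=True)
    # mark the first occurrences; every unmarked column is a repeat
    keep = np.zeros(a.shape[1], dtype=bool)
    keep[first_idx] = True
    # iterating a 2D array yields its rows, so tuple() gives a tuple of 1D arrays
    return tuple(a[:, keep]), tuple(a[:, ~keep])

# the example array from the question, written out explicitly
indices = (np.array([0, 1, 0, 1]), np.array([1, 0, 0, 0]))
indices, indices_2 = split_first_occurrences(indices)
print(indices)    # first occurrences, in original order
print(indices_2)  # remaining repeats
```

If a pair occurs more than twice, the second result still contains duplicates. Calling the function again on that result should peel off one more layer each time, which gives the "multiple new arrays without duplicates" variant mentioned in the question.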