For example, I have two arrays: 'x' holds the actual values and 'I' holds their indices in the array.
x = [1, 2, 3, 3, 2, 4]
I = [0, 1, 2, 3, 4, 5]
In 'x', the 4th value is a duplicate of the 3rd value, and the 5th value is a duplicate of the 2nd value.
Therefore, I want to generate the array
y = [0, 1, 2, 2, 1, 5]
(containing, for each value, the index of its first occurrence in the original array).
How can I do this efficiently using NumPy?
CodePudding user response:
You could do:
>>> import numpy as np
>>> u, idx, inv = np.unique(x, return_inverse=True, return_index=True)
>>> idx
array([0, 1, 2, 5], dtype=int64)
>>> inv
array([0, 1, 2, 2, 1, 3], dtype=int64)
>>> idx[inv]
array([0, 1, 2, 2, 1, 5], dtype=int64)
To see why this works, read the docs of np.unique: inv are the indices of the unique array that can be used to reconstruct x, and idx are the indices of x that result in the unique array.
So you just take idx, the indices of x that result in the unique array, and index it with inv. In the same way that u[inv] reconstructs x, idx[inv] gives the first-occurrence index for every element of x.
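The approach above can be wrapped in a small helper. This is only a sketch, assuming a 1-D input; the function name first_occurrence_indices is made up for illustration and is not part of NumPy. Note that np.unique sorts the unique values, but idx still points back into the original array, so this also works when x is not sorted.

```python
import numpy as np


def first_occurrence_indices(x):
    """For each element of a 1-D array, return the index of its first occurrence.

    Hypothetical helper name; it only wraps the np.unique call from the answer.
    """
    x = np.asarray(x)
    if x.ndim != 1:
        raise ValueError("this sketch assumes a 1-D input")
    # idx[k] is the first position in x of the k-th (sorted) unique value;
    # inv[i] is the position of x[i] within the unique values.
    _, idx, inv = np.unique(x, return_index=True, return_inverse=True)
    # Look up, for every element, the first position of its unique value.
    return idx[inv]


x = np.array([1, 2, 3, 3, 2, 4])
y = first_occurrence_indices(x)
print(y.tolist())  # [0, 1, 2, 2, 1, 5]
```

Since np.unique is based on sorting, this runs in roughly O(n log n) time and stays fully vectorized, with no Python-level loop.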