Remove duplicate index in 2D array

Time:07-03

I have this 2D numpy array here:

arr = np.array([[1,2],
                [2,2],
                [3,2],
                [4,2],
                [5,3]])

I would like to delete every row whose value at index 1 duplicates the value in the previous row, keeping only the first row of each run, and get an output like so:

np.array([[1,2],
          [5,3]])

However, when I run my code it raises an error. Here is my code:

for x in range(0, len(arr)):
    if arr[x][1] == arr[x-1][1]:
        arr = np.delete(arr, x, 0)

>>> IndexError: index 3 is out of bounds for axis 0 with size 2

CodePudding user response:

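Rather than trying to delete from the array inside the loop (the array shrinks while range(0, len(arr)) stays fixed, so x eventually runs past the end), you can use np.unique with return_index=True to find the index of the first occurrence of each unique value in the second column and use those indices to pull the rows out. Note that np.unique drops every repeat of a value, not only consecutive ones, and orders the indices by sorted value rather than by position: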

import numpy as np

arr = np.array([[1,2],
                [2,2],
                [3,2],
                [4,2],
                [5,3]])

# u: sorted unique values of column 1; i: index of each value's first occurrence
u, i = np.unique(arr[:,1], return_index=True)

arr[i]
# array([[1, 2],
#        [5, 3]])
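
If the goal is strictly to drop rows that repeat the value of the row directly above them (so a value that shows up again later is kept), a boolean mask that compares each row with its predecessor may be a closer fit. This is only a sketch: the helper name drop_consecutive_duplicates, its col parameter, and the second test array arr2 are made up for illustration and are not part of the question.

```python
import numpy as np

def drop_consecutive_duplicates(a, col=1):
    """Keep the first row of each run of equal values in column `col`.

    Hypothetical helper for illustration; not a NumPy function.
    """
    a = np.asarray(a)
    if len(a) == 0:
        return a
    # True where a row's value differs from the row above it.
    # Row 0 has no predecessor, so it is always kept.
    keep = np.concatenate(([True], a[1:, col] != a[:-1, col]))
    return a[keep]

arr = np.array([[1, 2],
                [2, 2],
                [3, 2],
                [4, 2],
                [5, 3]])
result = drop_consecutive_duplicates(arr)
# array([[1, 2],
#        [5, 3]])

# Assumed extra example: the value 2 comes back after a 3, so it starts
# a new run and its row is kept (np.unique would drop it).
arr2 = np.array([[1, 2],
                 [2, 2],
                 [3, 3],
                 [4, 2]])
result2 = drop_consecutive_duplicates(arr2)
# array([[1, 2],
#        [3, 3],
#        [4, 2]])
```

For the array in the question both approaches give the same result; they differ only when a value reappears after a different one, or when the column is not already sorted.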