I have a numpy array:
arr = array([[991.4, 267.3, 192.3],
[991.4, 267.4, 192.3],
[991.4, 267.4, 192.3],
...,
[993.5, 268. , 192.6],
[993.5, 268. , 192.6],
[993.5, 268.1, 192.6]])
As you can see, there are some duplicate rows in this.
I have tried arr = np.unique(arr)
but that returns:
array([192.3, 192.4, 192.5, 192.6, 266.6, 266.7, 266.8, 266.9, 267. ,
267.1, 267.2, 267.3, 267.4, 267.5, 267.6, 267.7, 267.8, 267.9,
268. , 268.1, 268.2, 268.3, 268.4, 268.5, 268.6, 268.7, 268.8,
991.4, 991.5, 991.6, 991.7, 991.8, 991.9, 992. , 992.1, 992.2,
992.3, 992.4, 992.5, 992.6, 992.7, 992.8, 992.9, 993. , 993.1,
993.2, 993.3, 993.4, 993.5])
I need to retain the nested (row) structure of the array: each nested array should be compared to the other nested arrays as a whole, and only whole-row duplicates removed, i.e.:
[991.4, 267.3, 192.3],
[991.4, 267.4, 192.3],
[991.4, 267.4, 192.3],
The above contains only 2 unique rows, so after filtering it should be:
[991.4, 267.3, 192.3],
[991.4, 267.4, 192.3],
CodePudding user response:
new_data = np.unique(arr, axis=0)
This should help, since we only need to remove duplicate rows. Passing the additional parameter axis=0 computes uniqueness over rows, while axis=1 would do so over columns.
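A minimal sketch of this on data modeled after the question (the sample rows here are illustrative, not the full array):

```python
import numpy as np

arr = np.array([[991.4, 267.3, 192.3],
                [991.4, 267.4, 192.3],
                [991.4, 267.4, 192.3]])

# axis=0 treats each row as one element; duplicate rows collapse to one.
# Note: the result is sorted lexicographically by row, not in input order.
new_data = np.unique(arr, axis=0)
print(new_data)
# [[991.4 267.3 192.3]
#  [991.4 267.4 192.3]]
```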
CodePudding user response:
To remove duplicate rows in a NumPy array, use np.unique with the axis parameter and, optionally, the return_index parameter. axis specifies the axis along which the unique elements are computed, and return_index=True additionally returns the index of the first occurrence of each unique row.
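Since np.unique sorts its result, return_index is useful when you want the deduplicated rows back in their original order. A sketch (the sample rows here are made up to show the reordering):

```python
import numpy as np

arr = np.array([[993.5, 268.0, 192.6],
                [991.4, 267.3, 192.3],
                [991.4, 267.3, 192.3]])

# return_index gives the index of each unique row's first occurrence;
# sorting those indices restores the original row order.
_, idx = np.unique(arr, axis=0, return_index=True)
deduped = arr[np.sort(idx)]
print(deduped)
# [[993.5 268.  192.6]
#  [991.4 267.3 192.3]]
```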