I have a numpy array:
arr = array([[991.4, 267.3, 192.3],
[991.4, 267.4, 192.3],
[991.4, 267.4, 192.3],
...,
[993.5, 268. , 192.6],
[993.5, 268. , 192.6],
[993.5, 268.1, 192.6]])
As you can see, there are some duplicate rows in this.
I have tried arr = np.unique(arr)
but that returns:
array([192.3, 192.4, 192.5, 192.6, 266.6, 266.7, 266.8, 266.9, 267. ,
267.1, 267.2, 267.3, 267.4, 267.5, 267.6, 267.7, 267.8, 267.9,
268. , 268.1, 268.2, 268.3, 268.4, 268.5, 268.6, 268.7, 268.8,
991.4, 991.5, 991.6, 991.7, 991.8, 991.9, 992. , 992.1, 992.2,
992.3, 992.4, 992.5, 992.6, 992.7, 992.8, 992.9, 993. , 993.1,
993.2, 993.3, 993.4, 993.5])
I need to retain the nested (row) structure of the array: each nested array should be compared to the other nested arrays as a whole, and only whole-row duplicates removed, i.e.:
[991.4, 267.3, 192.3],
[991.4, 267.4, 192.3],
[991.4, 267.4, 192.3],
The above contains only 2 unique rows, so after filtering it should be:
[991.4, 267.3, 192.3],
[991.4, 267.4, 192.3],
CodePudding user response:
new_data = np.unique(arr, axis=0)
This should help, since we only need to remove duplicate rows. Passing the additional parameter axis=0 computes uniqueness over rows, while axis=1 would do so over columns.
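A minimal sketch of this on data modeled after the question (the sample rows here are illustrative, not the full array):

```python
import numpy as np

arr = np.array([[991.4, 267.3, 192.3],
                [991.4, 267.4, 192.3],
                [991.4, 267.4, 192.3]])

# axis=0 treats each row as one element; duplicate rows collapse to one.
# Note: the result is sorted lexicographically by row, not in input order.
new_data = np.unique(arr, axis=0)
print(new_data)
# [[991.4 267.3 192.3]
#  [991.4 267.4 192.3]]
```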
CodePudding user response:
To remove duplicate rows in a NumPy array, use np.unique with the axis parameter and, optionally, the return_index parameter. axis specifies the axis along which the unique elements are computed, and return_index=True additionally returns the index of the first occurrence of each unique row.
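Since np.unique sorts its result, return_index is useful when you want the deduplicated rows back in their original order. A sketch (the sample rows here are made up to show the reordering):

```python
import numpy as np

arr = np.array([[993.5, 268.0, 192.6],
                [991.4, 267.3, 192.3],
                [991.4, 267.3, 192.3]])

# return_index gives the index of each unique row's first occurrence;
# sorting those indices restores the original row order.
_, idx = np.unique(arr, axis=0, return_index=True)
deduped = arr[np.sort(idx)]
print(deduped)
# [[993.5 268.  192.6]
#  [991.4 267.3 192.3]]
```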