Numpy.unique on 3d array with axis=2 but not working as expected

Time:12-23

Consider the following code. With axis=2, I expected np.unique to reduce the duplicated [1 1] to [1], but it does not. Why doesn't it perform the unique operation along the 3rd axis?

import numpy as np

arr = np.array([[[1, 1], [1, 1], [1, 1]],
                [[7, 1], [10, 1], [10, 1]],
                [[1, 1], [1, 1], [1, 1]]])

print(np.unique(arr, axis=0))
print("----------------")
print(np.unique(arr, axis=1))
print("----------------")
print(np.unique(arr, axis=2))

I tried many other examples, and it still does not work on the 3rd axis.

CodePudding user response:

Note this from the documentation (citing help(np.unique)):

The axis to operate on. If None, ar will be flattened. If an integer, the subarrays indexed by the given axis will be flattened and treated as the elements of a 1-D array with the dimension of the given axis […]

When an axis is specified the subarrays indexed by the axis are sorted. […] The result is that the flattened subarrays are sorted in lexicographic order starting with the first element.

So in your case it sorts and compares two sub-arrays: arr[:, :, 0].flatten(), which is [1, 1, 1, 7, 10, 10, 1, 1, 1], and arr[:, :, 1].flatten(), which is [1, 1, 1, 1, 1, 1, 1, 1, 1].

These two are not equal, so nothing is removed. The only change is that the second sub-array ends up before the first, because it comes first in a lexicographic comparison.
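To see this concretely, here is a small sketch using the array from the question. The `dup` array is a hypothetical example that is not in the original question; it was added only to show what np.unique does when the two slices along axis 2 really are identical.

```python
import numpy as np

arr = np.array([[[1, 1], [1, 1], [1, 1]],
                [[7, 1], [10, 1], [10, 1]],
                [[1, 1], [1, 1], [1, 1]]])

# The two "elements" that np.unique compares when axis=2
col0 = arr[:, :, 0].flatten()
col1 = arr[:, :, 1].flatten()
print(col0)
print(col1)

# The slices differ, so both are kept; they are only reordered
result = np.unique(arr, axis=2)
print(result.shape)

# Hypothetical array whose two slices along axis 2 are identical
dup = np.stack([arr[:, :, 0], arr[:, :, 0]], axis=2)
dup_result = np.unique(dup, axis=2)
print(dup_result.shape)
```

In the first case the shape stays (3, 3, 2), while in the second case one of the two identical slices is dropped and the last axis shrinks to length 1.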

I assume you wanted it to get rid of the duplicate [1, 1] entries. However, np.unique cannot work that way, because a NumPy array, unlike a nested list, must be rectangular. Removing those duplicates would leave arr[0] with a different number of entries than arr[1], which an array cannot represent.
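If the goal really is to deduplicate the [x, y] pairs, two possible approaches are sketched below. This assumes either that the grouping by first axis does not matter (flatten everything into a list of pairs) or that a plain Python list of differently sized arrays is an acceptable result. The names `pairs`, `unique_pairs` and `per_row` are only illustrative.

```python
import numpy as np

arr = np.array([[[1, 1], [1, 1], [1, 1]],
                [[7, 1], [10, 1], [10, 1]],
                [[1, 1], [1, 1], [1, 1]]])

# Option 1: ignore the grouping and deduplicate all pairs at once
pairs = arr.reshape(-1, 2)               # shape (9, 2), one pair per row
unique_pairs = np.unique(pairs, axis=0)  # unique rows, sorted lexicographically
print(unique_pairs)

# Option 2: deduplicate within each arr[i]; the results have different
# lengths, so they are kept in a Python list rather than one array
per_row = [np.unique(row, axis=0) for row in arr]
for u in per_row:
    print(u)
```

Option 1 returns a single (3, 2) array holding [1, 1], [7, 1] and [10, 1]. Option 2 keeps one result per arr[i], which is exactly the ragged shape a single array could not hold.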
