How to compare two multidimensional numpy arrays and delete elements from one if the element is also in the other

I have two numpy arrays with shapes (n,3,2) and (n,2).

I want to delete the element or elements from the first array (so that it ends up with shape (n-1,3,2) or (n-2,3,2), etc.) if they are also elements of the second array.

What would be the best way to do something like that? Any help is really appreciated.

Let's say

array1 = 
[[[ 5.1  5. ]
  [ 6.2  4.4]
  [ 4.   6.3]]

 [[ 4.2  4.5]
  [ 4.4  5.3]
  [ 4.   6.3]]

 [[ 4.4  5.3]
  [ 5.1  5. ]
  [ 4.   6.3]]]


array2 =
[[ 4.2  4.5]
 [ 4.4  5.3]
 [ 4.5  4.8]
 [ 4.  6.3]]

As you can see, all three elements of array1[1] are also in array2, so I want to delete array1[1] from array1.

I tried the following for and if loops, but it does not work.

for i in range(0,len(array1),1):
   if(array1[i][0] in array2 and array1[i][1] in array2 and array1[i][2] in array2):
       
       array1 = np.delete(array1, [i], axis=0)

   else:
      continue
print(array1)

CodePudding user response：

One possible way to achieve this is to use np.isin, which returns a boolean array indicating whether each element of one array is contained in another array. For example, np.isin(array1, array2) returns a boolean array of shape (n, 3, 2) that is True wherever a value of array1 also occurs somewhere in array2. Note that the test is made value by value against all the values of array2, not row by row.

Then, you can use np.all along the last axis to check whether both values of a row of array1 occur in array2. For example, np.all(np.isin(array1, array2), axis=-1) returns a boolean array of shape (n, 3) that is True where both values of a row of shape (2,) are in array2.

Finally, you can use np.all again, this time along the second axis, to check whether all three rows of a (3, 2) block pass that test. For example, np.all(np.all(np.isin(array1, array2), axis=-1), axis=1) returns a boolean array of shape (n,) that is True for the blocks whose three rows are all found in array2. (np.any along this axis would flag a block as soon as a single row matches, which is not what the question asks for.)

To delete those blocks from array1, invert the boolean array with np.logical_not and use the result as a mask to index array1. For example, array1[np.logical_not(np.all(np.all(np.isin(array1, array2), axis=-1), axis=1))] returns a new array containing only the blocks of array1 that are not fully contained in array2.
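The intermediate shapes are easier to follow on the data from the question. The following is only a minimal sketch: the names value_hits, row_hits and block_hits are illustrative, and it reduces with np.all along the second axis so that a block is flagged only when all three of its rows pass.

```python
import numpy as np

# Data from the question.
array1 = np.array([[[5.1, 5.0], [6.2, 4.4], [4.0, 6.3]],
                   [[4.2, 4.5], [4.4, 5.3], [4.0, 6.3]],
                   [[4.4, 5.3], [5.1, 5.0], [4.0, 6.3]]])
array2 = np.array([[4.2, 4.5], [4.4, 5.3], [4.5, 4.8], [4.0, 6.3]])

# Step 1: is each single value of array1 present anywhere in array2?
value_hits = np.isin(array1, array2)        # shape (3, 3, 2)
# Step 2: are both values of a row present?
row_hits = np.all(value_hits, axis=-1)      # shape (3, 3)
# Step 3: do all three rows of a block pass?
block_hits = np.all(row_hits, axis=1)       # shape (3,)
# Step 4: keep only the blocks that are not fully matched.
kept = array1[np.logical_not(block_hits)]   # shape (2, 3, 2)

print(block_hits)   # [False  True False]
print(kept.shape)   # (2, 3, 2)
```

Only array1[1] is flagged here, which matches what the question expects.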

Explanation

The logic behind this solution is to compare the elements of array1 and array2 at different levels of granularity, and use boolean operations to combine the results.

np.isin compares each value of array1 with all the values of array2, and returns True where there is a match. This is the most basic level of comparison; it ignores both the position and the row structure of the values in array2.

np.all along the last axis reduces each row of shape (2,) of array1 to a single boolean that is True if both of its values occur in array2. Because np.isin works value by value, the two values do not have to come from the same row of array2, so a row such as [4.5, 4.5] would pass even though array2 has no such row. This is a limitation of the approach.

np.all along the second axis reduces each block of shape (3, 2) of array1 to a single boolean that is True only if all three of its rows passed the previous test. This is the most specific level of comparison.

np.logical_not inverts the boolean array, so that the True values become False and vice versa. This selects the blocks of array1 that should be kept, since the ones to delete are those found in array2.

The inverted boolean array is then used as a mask to index array1. This final step returns a new array without the blocks whose rows are all in array2; array1 itself is not modified.
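Because np.isin tests individual values, a row can pass even when its two values come from different rows of array2. If whole rows have to match, one option is a broadcast comparison. The following is a sketch only: blocks_fully_in is a hypothetical helper, not a NumPy function, and the small array1 is made up to show the difference.

```python
import numpy as np

def blocks_fully_in(blocks, rows):
    """Hypothetical helper: True for each block whose rows are all rows of `rows`."""
    # (n, 3, 1, 2) == (1, 1, m, 2) broadcasts to (n, 3, m, 2).
    same_value = blocks[:, :, None, :] == rows[None, None, :, :]
    # A row matches only if both of its values match the same row of `rows`.
    same_row = same_value.all(axis=-1)   # shape (n, 3, m)
    row_found = same_row.any(axis=-1)    # shape (n, 3)
    return row_found.all(axis=1)         # shape (n,)

# Made-up data: [4.5, 4.5] is built from values of array2 but is not one of its rows.
array1 = np.array([[[4.5, 4.5], [4.2, 4.5], [4.0, 6.3]],
                   [[4.2, 4.5], [4.4, 5.3], [4.0, 6.3]]])
array2 = np.array([[4.2, 4.5], [4.4, 5.3], [4.5, 4.8], [4.0, 6.3]])

value_wise = np.isin(array1, array2).all(axis=-1).all(axis=1)
row_wise = blocks_fully_in(array1, array2)

print(value_wise)  # [ True  True]
print(row_wise)    # [False  True]
```

Both versions rely on exact floating-point equality. For values that come out of a computation rather than being typed in, a tolerance-based comparison such as np.isclose may be the safer choice.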

Example

Here is a Python code block that implements the solution and prints the result:

import numpy as np

array1 = np.array([[[ 5.1,  5. ],
                    [ 6.2,  4.4],
                    [ 4. ,  6.3]],

                   [[ 4.22,  4.5],
                    [ 4.44,  5.3],
                    [ 4. ,  6.43]],

                   [[ 4.4,  5.3],
                    [ 5.1,  5. ],
                    [ 4. ,  6.3]]])

array2 = np.array([[ 4.22,  4.5],
                   [ 4.44,  5.3],
                   [ 4.5,  4.8],
                   [ 4. ,  6.43]])

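# A block is dropped only if all three of its rows are found in array2,
# hence np.all (not np.any) along axis=1.
mask = np.logical_not(np.all(np.all(np.isin(array1, array2), axis=-1), axis=1))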
array1_after_deletion = array1[mask]
print(array1_after_deletion)

The output is:

[[[5.1 5. ]
  [6.2 4.4]
  [4.  6.3]]

 [[4.4 5.3]
  [5.1 5. ]
  [4.  6.3]]]