Numpy - Same Dtype Arrays Comparison - DeprecationWarning: Elementwise comparison failed

I am given an original list of floats, where the integer part indicates the type of experiment and the decimal part indicates which repetition of that experiment it was.

My job is to remove all the floats whose integer part appears fewer than 3 times.

This is my 3rd time using Numpy, so I googled around a bit, and all the answers to this error basically said the same thing: the arrays have to be of the same data type, and they have to have the same dimensionality (shape?).

I've looked into both arrays (needToBeRemoved and newId), and both have:

* Data type int32
* 1D shape

So how come the script gives me the following error:

"DeprecationWarning: elementwise comparison failed; this will raise an error in the future.
  indexOfRemoval = np.where(newId == needToBeRemoved) #Finds the indexes of all numbers that need to be removed"

And what can I do to fix it?

My code:

import numpy as np
import math
import warnings

def removeIncomplete(id):
    needToBeRemoved = []
    maxMinOne = math.floor(max(id))


    newId = np.array([math.floor(y) for y in id if 0<y and y<(maxMinOne + 1) ]) #Turns the list into integers

    print(f"NewId : {newId}")
    for i in newId:
        print(i)
        occurences = np.count_nonzero(newId == i)
        print(f" This is occurences: {occurences} ")#figures out how many times each element appears in this list

        if occurences<3: #Makes sure the element only appears in the needToBeRemoved list once
            needToBeRemoved = needToBeRemoved + [i] #Creates a list of all numbers which occur less than 3 times
            print(f" Numbers that need to be removed: {needToBeRemoved}")

    needToBeRemoved = np.array(needToBeRemoved)


    indexOfRemoval = np.where(newId == needToBeRemoved) #Finds the indexes of all numbers that need to be removed

    id = np.delete(id,indexOfRemoval) #Removes all elements at these index positions
    return id

arr = np.array([1.3, 2.2, 2.3, 4.2, 5.1, 3.2, 5.3, 3.3, 2.1, 1.1, 5.2, 3.1])

removeIncomplete(arr)

NOTE: The output HAS to be a np.array according to the assignment.

CodePudding user response：

With your arr (id):

In [322]: import math 
In [323]: maxMinOne = math.floor(max(arr))
In [324]: newId = np.array([math.floor(y) for y in arr if 0<y and y<(maxMinOne + 1) ])
In [325]: newId
Out[325]: array([1, 2, 2, 4, 5, 3, 5, 3, 2, 1, 5, 3])

comparing the array to a scalar or single element array:

In [326]: newId==np.array([2])
Out[326]: 
array([False,  True,  True, False, False, False, False, False,  True,
       False, False, False])

but comparing this to an array of a different size, whose shape (2,) cannot be broadcast against newId's shape (12,), produces:

In [327]: newId==np.array([2,3])
<ipython-input-327-c00ae167502e>:1: DeprecationWarning: elementwise comparison failed; this will raise an error in the future.
  newId==np.array([2,3])
Out[327]: False
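
The failure above is a broadcasting problem, not a dtype problem. As a minimal sketch, the shapes can be checked before comparing. np.broadcast raises ValueError when two shapes are incompatible; the helper name can_broadcast is hypothetical and only used for illustration. Note that the warning text itself says this comparison will become an error in a future NumPy release.

```python
import numpy as np

def can_broadcast(a, b):
    """Hypothetical helper: True if a and b can be compared elementwise."""
    try:
        np.broadcast(a, b)  # raises ValueError when the shapes are incompatible
        return True
    except ValueError:
        return False

new_id = np.array([1, 2, 2, 4, 5, 3, 5, 3, 2, 1, 5, 3])

print(can_broadcast(new_id, np.array([2])))              # shapes (12,) and (1,)
print(can_broadcast(new_id, np.array([2, 3])))           # shapes (12,) and (2,)
print(can_broadcast(new_id, np.array([2, 3])[:, None]))  # shapes (12,) and (2, 1)
```

Only the middle case is incompatible, which is the case that triggers the warning.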

you could also compare it to another array of the same size:

In [328]: arr.shape
Out[328]: (12,)
In [329]: newId==np.ones(12,int)*2
Out[329]: 
array([False,  True,  True, False, False, False, False, False,  True,
       False, False, False])

To compare it to an array of a different size, we could use a broadcasted equal:

In [342]: newId==np.array([2,3])[:,None]
Out[342]: 
array([[False,  True,  True, False, False, False, False, False,  True,
        False, False, False],
       [False, False, False, False, False,  True, False,  True, False,
        False, False,  True]])

and finding where it's true in any row:

In [343]: (newId==np.array([2,3])[:,None]).any(axis=0)
Out[343]: 
array([False,  True,  True, False, False,  True, False,  True,  True,
       False, False,  True])

and the index:

In [346]: idx=np.where((newId==np.array([2,3])[:,None]).any(axis=0))
In [347]: idx
Out[347]: (array([ 1,  2,  5,  7,  8, 11]),)
In [348]: newId[idx]
Out[348]: array([2, 2, 3, 3, 2, 3])
In [349]: np.delete(arr, idx)
Out[349]: array([1.3, 4.2, 5.1, 5.3, 1.1, 5.2])

np.isin can also be used to get the same mask without the explicit broadcasting:

In [351]: np.isin(newId, np.array([2,3]))
Out[351]: 
array([False,  True,  True, False, False,  True, False,  True,  True,
       False, False,  True])
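
Since np.isin returns a boolean mask, the index step with np.where and np.delete can be skipped by indexing with the inverted mask. This is a small sketch, assuming the integer part is taken with np.floor rather than the list comprehension from the question; the names new_id, mask and kept are illustrative.

```python
import numpy as np

arr = np.array([1.3, 2.2, 2.3, 4.2, 5.1, 3.2, 5.3, 3.3, 2.1, 1.1, 5.2, 3.1])
new_id = np.floor(arr).astype(int)  # integer part of each value

mask = np.isin(new_id, [2, 3])      # True where the integer part is 2 or 3
kept = arr[~mask]                   # keep everything that is NOT flagged
print(kept)
```

For this input, kept holds the same values as the np.delete(arr, idx) result shown earlier.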

Counting the occurrences of each value with np.unique:

In [356]: u,c=np.unique(newId, return_counts=True)
In [357]: u
Out[357]: array([1, 2, 3, 4, 5])
In [358]: c
Out[358]: array([2, 3, 3, 1, 3])
In [359]: u[c<3]
Out[359]: array([1, 4])
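
Putting the counting and the np.isin mask together, the whole task might look like the sketch below. It is one possible arrangement, not the only one. The function name remove_incomplete and the min_count parameter are assumptions for illustration, and the range filter (0 < y < max + 1) from the question is left out because it does not drop anything for this input.

```python
import numpy as np

def remove_incomplete(ids, min_count=3):
    """Drop values whose integer part occurs fewer than min_count times."""
    ids = np.asarray(ids, dtype=float)
    experiment = np.floor(ids).astype(int)           # integer part = experiment type
    values, counts = np.unique(experiment, return_counts=True)
    rare = values[counts < min_count]                # experiment types to remove
    return ids[~np.isin(experiment, rare)]           # result is still an np.ndarray

arr = np.array([1.3, 2.2, 2.3, 4.2, 5.1, 3.2, 5.3, 3.3, 2.1, 1.1, 5.2, 3.1])
result = remove_incomplete(arr)
print(result)
```

With the sample array, experiment types 1 and 4 occur fewer than 3 times, so 1.3, 4.2 and 1.1 are dropped and the other nine values are kept in their original order.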

compare that with your loop, which visits every element rather than every unique value and so appends 1 twice:

In [360]: def foo(newId):
     ...:     needToBeRemoved = []
     ...:     for i in newId:
     ...:         occurences = np.count_nonzero(newId == i)
     ...:         if occurences<3:
     ...:             needToBeRemoved = needToBeRemoved + [i]
     ...:     return np.array(needToBeRemoved)
     ...: 
In [361]: foo(newId)
Out[361]: array([1, 4, 1])
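
The per-element counts that the loop computes one at a time can also be obtained in a single call. As a sketch, np.unique with return_inverse=True gives, for each element, the position of its value in the unique array, so indexing the counts with it yields one count per element. The variable names are illustrative.

```python
import numpy as np

new_id = np.array([1, 2, 2, 4, 5, 3, 5, 3, 2, 1, 5, 3])

values, inverse, counts = np.unique(new_id, return_inverse=True, return_counts=True)
per_element = counts[inverse]        # how often each element's value occurs
print(per_element)
print(new_id[per_element < 3])       # same elements the loop collects
```

The last line reproduces the loop's result, duplicate included, and per_element < 3 can be used directly as a removal mask.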

CodePudding user response：

You don't need the numpy module. Do the following:

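from collections import Counter

arr = [1.3, 2.2, 2.3, 4.2, 5.1, 3.2, 5.3, 3.3, 2.1, 1.1, 5.2, 3.1]
counts = Counter(int(i) for i in arr)  # occurrences of each integer part
new_arr = []

for i in arr:
    if counts[int(i)] >= 3:  # keep only experiments run at least 3 times
        new_arr.append(i)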

Note: the result is in new_arr. Since the assignment requires a NumPy array, it can be converted with np.array(new_arr).