Home > OS >  np.where with arbitrary number of conditions
np.where with arbitrary number of conditions

Time:12-09

Problem

This question: Numpy where function multiple conditions asks how to use np.where with two conditions. This answer suggests to use the & operator between conditions, which works if we have a low number of conditions which can be typed. This answer suggests using the np.logical_and, which can take only two arguments.

This thread: Numpy "where" with multiple conditions also discusses multiple conditions for np.where, but the number of conditions are known in advance.

I am looking for a way to evaluate an np.where expression without knowing the number of conditions in advance.


Reproducible setup

I have a 2D array:

arr = \
np.array([[1,2,3,4],
          [4,5,6,7],
          [9,8,7,6],
          [0,1,0,1],
          [9,7,6,5]])

Select the rows which have, for example, index 1 element larger than 5, index 2 element larger than 3. To do that, I do:

res = arr[np.where((arr[:,1]>5) & (arr[:,2]>4))]

res is then:

array([[9, 8, 7, 6],
       [9, 7, 6, 5]])

as expected.

But what if I have these conditions as lists? The above example would be:

cols = [1,2] # arbitrary length list
tholds = [5,4] # arbitrary length list

These two lists are unknown length in advance, but they have the same length.

How can I get res using the cols and tholds lists?


What I have tried

Use ast.literal_eval to define:

filterstring = "&".join([f"(pdist[:,{col}]>{th})" for col, th in zip(cols,tholds)])

which evaluates to (pdist[:,1]>5)&(pdist[:,2]>4), ie what we had above within np.where() when the conditions are typed out manually.

However, ast.literal_eval(f"np.where({filterstring})") gives an error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-269-1aaff20de82f> in <module>()
----> 1 ast.literal_eval(f"np.where({filterstring})")

3 frames
/usr/lib/python3.7/ast.py in _convert_num(node)
     53         elif isinstance(node, Num):
     54             return node.n
---> 55         raise ValueError('malformed node or string: '   repr(node))
     56     def _convert_signed_num(node):
     57         if isinstance(node, UnaryOp) and isinstance(node.op, (UAdd, USub)):

ValueError: malformed node or string: <_ast.Call object at 0x7f41daa21f10>

So this did not work. This answer to the question ast.literal_eval() malformed node or string while converting a string with list of array()s confirms that this is not the right approach.


EDIT:

The suggestion to use np.wheres in a succession works fine for this particular example, but is not really what I look for. I would want to call np.where once, not multiple times evaluating one condition only.

CodePudding user response:

Tried to avoid eval. It has some security implications.

You could to it iteratively, like so

def unknown_conditions(arr, cols, tholds):

    for col, thold in zip(cols, tholds):
        arr = arr[np.where(arr[:, col] > thold)]
    
    return arr

CodePudding user response:

You could accumulate the number of met conditions and then call the np.where function once. From there it would be very easy to mix and/or combination from the conditions.

(Conceptually very similar to academy's suggestion.)

def filter_by_conditions(arr, cols, tholds):
    n_conditions = len(cols)
    bool_accumulator = np.zeros(arr.shape[0])
    for c, t in zip(cols, tholds):
        bool_accumulator  = (arr[:, c] > t).astype(int)

    return arr[np.where(bool_accumulator) == n_conditions]
  • Related