Problem
This question: Numpy where function multiple conditions asks how to use np.where with two conditions. This answer suggests using the & operator between conditions, which works if there are only a few conditions that can be typed out by hand. This answer suggests using np.logical_and, which takes only two input arrays per call.
This thread: Numpy "where" with multiple conditions also discusses multiple conditions for np.where, but the number of conditions is known in advance.
I am looking for a way to evaluate an np.where expression without knowing the number of conditions in advance.
Reproducible setup
I have a 2D array:
import numpy as np

arr = np.array([[1, 2, 3, 4],
                [4, 5, 6, 7],
                [9, 8, 7, 6],
                [0, 1, 0, 1],
                [9, 7, 6, 5]])
Select the rows in which, for example, the element at index 1 is larger than 5 and the element at index 2 is larger than 4. To do that, I do:
res = arr[np.where((arr[:,1]>5) & (arr[:,2]>4))]
res is then:
array([[9, 8, 7, 6],
       [9, 7, 6, 5]])
as expected.
But what if I have these conditions as lists? The above example would be:
cols = [1,2] # arbitrary length list
tholds = [5,4] # arbitrary length list
The length of these two lists is not known in advance, but they always have the same length.
How can I get res using the cols and tholds lists?
What I have tried
I tried to build the condition as a string and evaluate it with ast.literal_eval. First I define:
filterstring = "&".join([f"(pdist[:,{col}]>{th})" for col, th in zip(cols,tholds)])
which evaluates to (pdist[:,1]>5)&(pdist[:,2]>4), i.e. what we had above within np.where() when the conditions are typed out manually.
However, ast.literal_eval(f"np.where({filterstring})") gives an error:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-269-1aaff20de82f> in <module>()
----> 1 ast.literal_eval(f"np.where({filterstring})")
3 frames
/usr/lib/python3.7/ast.py in _convert_num(node)
53 elif isinstance(node, Num):
54 return node.n
---> 55 raise ValueError('malformed node or string: ' + repr(node))
56 def _convert_signed_num(node):
57 if isinstance(node, UnaryOp) and isinstance(node.op, (UAdd, USub)):
ValueError: malformed node or string: <_ast.Call object at 0x7f41daa21f10>
So this did not work. This answer to the question ast.literal_eval() malformed node or string while converting a string with list of array()s confirms that this is not the right approach.
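For illustration, here is a minimal sketch of why this fails, assuming only the standard library: ast.literal_eval accepts Python literals (numbers, strings, lists, tuples, dicts and so on) and rejects anything else, including function calls such as np.where(...). The len(...) call below is just a stand-in for any call expression.

```python
import ast

# literal_eval parses plain Python literals without executing any code
parsed = ast.literal_eval("[1, 2, (3, 4)]")
print(parsed)

# A function call is not a literal, so literal_eval refuses it
# (len(...) is only a stand-in here for np.where(...))
try:
    ast.literal_eval("len([1, 2, 3])")
    rejected = False
except ValueError as exc:
    rejected = True
    print("rejected:", exc)
```

So the error does not come from the filter string itself; any expression containing a call would be rejected in the same way.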
EDIT:
The suggestion to use several np.where calls in succession works fine for this particular example, but it is not really what I am looking for. I would like to call np.where once, rather than multiple times with a single condition each.
CodePudding user response:
Try to avoid eval; it has security implications.
You could do it iteratively, like so:
def unknown_conditions(arr, cols, tholds):
    for col, thold in zip(cols, tholds):
        arr = arr[np.where(arr[:, col] > thold)]
    return arr
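A possible variation on the same loop, sketched here under the assumption that a single final selection is preferred: combine the conditions into one boolean mask and index the array only once. The helper name rows_matching_all is made up for this example.

```python
import numpy as np

def rows_matching_all(arr, cols, tholds):
    # Hypothetical helper: start with "every row passes" and AND in each condition
    mask = np.ones(arr.shape[0], dtype=bool)
    for col, thold in zip(cols, tholds):
        mask &= arr[:, col] > thold
    # One selection at the end instead of one per condition
    return arr[mask]

arr = np.array([[1, 2, 3, 4],
                [4, 5, 6, 7],
                [9, 8, 7, 6],
                [0, 1, 0, 1],
                [9, 7, 6, 5]])

res = rows_matching_all(arr, [1, 2], [5, 4])
print(res)
```

With empty cols and tholds lists the mask stays all True, so every row is returned.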
CodePudding user response:
You could count, for each row, how many conditions are met and then call the np.where function once. From there it would be very easy to mix and/or combinations of the conditions.
(Conceptually very similar to academy's suggestion.)
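def filter_by_conditions(arr, cols, tholds):
    n_conditions = len(cols)
    # Count, per row, how many conditions are met
    bool_accumulator = np.zeros(arr.shape[0], dtype=int)
    for c, t in zip(cols, tholds):
        bool_accumulator += (arr[:, c] > t).astype(int)
    # Keep the rows that meet every condition, using a single np.where call
    return arr[np.where(bool_accumulator == n_conditions)]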
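The same idea can probably be written without an explicit loop. The sketch below shows two options that are not taken from the answers above: np.logical_and.reduce over a list of per-column conditions, and a broadcast comparison followed by .all(axis=1). Swapping .all for .any gives the "or" combination. The variable names are chosen for this example only.

```python
import numpy as np

arr = np.array([[1, 2, 3, 4],
                [4, 5, 6, 7],
                [9, 8, 7, 6],
                [0, 1, 0, 1],
                [9, 7, 6, 5]])
cols = [1, 2]    # arbitrary length list
tholds = [5, 4]  # same length as cols

# Option 1: reduce a list of boolean arrays with logical AND
mask_reduce = np.logical_and.reduce([arr[:, c] > t for c, t in zip(cols, tholds)])

# Option 2: compare the selected columns against the thresholds via broadcasting,
# then require every condition in a row to hold
mask_all = (arr[:, cols] > np.asarray(tholds)).all(axis=1)

# A single np.where call, as asked for in the question
res = arr[np.where(mask_all)]
print(res)

# "Or" combination: rows where at least one condition holds
mask_any = (arr[:, cols] > np.asarray(tholds)).any(axis=1)
```

Option 2 assumes every condition uses the same comparison operator (here >); Option 1 is more flexible, since each list entry can be any boolean array.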