In this operation, the array is sliced over a range of indices. Given the array
arr = np.array([.1, .11, .21, .01, .5, .7, .91, .92, .95, .96, .1, .21, .23, .6, .7, .71, .72, .95, 0.96, 0.97])
and a range of index values,
Step 1
drange = np.arange(start_, end_)
the slicing is carried out as below:
Step 2
select_val = arr[drange]
Then select_val is checked against a threshold, th:
Step 3
bool_data = select_val<th
Finally, argmin is used to return the index of the first minimum of bool_data, i.e. the position of the first value that is not below th.
Step 4
doutput = np.argmin(bool_data)
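As a small illustration with made-up values (not the data below), argmin on the boolean mask returns the position of the first False, i.e. the first element that is not below the threshold:
import numpy as np

select_val = np.array([0.1, 0.5, 0.95, 0.2])
bool_data = select_val < 0.9   # array([ True,  True, False,  True])
print(np.argmin(bool_data))    # 2 -> first position where the value is not below 0.9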
In my case, the start_ and end_ values are stored in a Pandas DataFrame:
df = pd.DataFrame(dict(s=[1, 10], e=[12, 19]))
whereas arr is a NumPy array.
Currently, I use Pandas' apply with a function that compresses steps 1-4 into one call:
def fx(arr, st, en, th):
    return np.argmin(arr[np.arange(st, en)] < th)
However, is it possible to use a vectorized approach instead?
The full code for the current strategy is below:
import numpy as np
import pandas as pd

def fx(arr, st, en, th):
    return np.argmin(arr[np.arange(st, en)] < th)

th = 0.9
np.random.seed(0)
arr = np.array([.1, .11, .21, .01, .5, .7, .91, .92, .95,               # window [1, 12): first value >= th is arr[6]
                .96, .1, .21, .23, .6, .7, .71, .72, .95, 0.96, 0.97])  # window [10, 19): first value >= th is arr[17]
df = pd.DataFrame(dict(s=[1, 10], e=[12, 19]))
df['opt'] = df.apply(lambda x: fx(arr, x['s'], x['e'], th), axis=1)
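On this data, the apply-based version produces the following opt column, which a vectorized version should reproduce:
   s   e  opt
0  1  12    5
1 10  19    7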
CodePudding user response:
This can be vectorized with NumPy broadcasting:
m1 = arr[:, None] > th                         # (n, 1) mask: values above the threshold
ix = np.arange(len(arr))[:, None]              # (n, 1) column of positions in arr
m2 = (ix >= list(df.s)) & (ix < list(df.e))    # (n, n_rows) mask: position inside each [s, e) window
df['opt'] = np.argmax(m1 & m2, axis=0) - df.s  # first in-window position above th, made relative to s
Result
   s   e  opt
0  1  12    5
1 10  19    7
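For completeness, here is a minimal self-contained sketch (same example data as above) that runs both the row-wise apply version and the broadcasted version and checks that they agree; note the two can only differ when a value equals th exactly, since one tests < th with argmin and the other > th with argmax:
import numpy as np
import pandas as pd

th = 0.9
arr = np.array([.1, .11, .21, .01, .5, .7, .91, .92, .95,
                .96, .1, .21, .23, .6, .7, .71, .72, .95, .96, .97])
df = pd.DataFrame(dict(s=[1, 10], e=[12, 19]))

# Row-wise apply version (steps 1-4 from the question)
def fx(arr, st, en, th):
    return np.argmin(arr[np.arange(st, en)] < th)

expected = df.apply(lambda x: fx(arr, x['s'], x['e'], th), axis=1)

# Broadcasted version from the answer
m1 = arr[:, None] > th                         # values above the threshold
ix = np.arange(len(arr))[:, None]              # positions in arr
m2 = (ix >= list(df.s)) & (ix < list(df.e))    # positions inside each [s, e) window
opt = np.argmax(m1 & m2, axis=0) - df.s        # first in-window hit, relative to s

assert (opt.values == expected.values).all()   # both give [5, 7] here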