Different results with np.searchsorted and np.argmin when finding nearest indexes

I have an array of timestamps (arr) and a list of [start, end] pairs (cuts). The goal is to extract the timestamps that fall between each start and end and collect them in a new array. I tried two methods, one with np.searchsorted() and one with np.argmin(), but they give different results. Is there an explanation for this?

Thank you!

Here is my code:

import numpy as np

# Initialization data  
arr = np.arange(761.55643, 1525.5704932002686, 1/ 1000)

cuts = [[810.211186646, 899.102014549], [903.520741867, 982.000921478], [985.201032795, 993.400610844], 
       [998.303881868, 1085.500698357], [1090.200656211, 1168.101925871], [1171.299249968, 1179.611318749], 
       [1184.610645285, 1271.597569677], [1275.600586067, 1363.696138556], [1368.301122947, 1455.500707533]]

# Compare the indexes returned by both methods for each interval
vector_validity = np.zeros(len(arr))
new_arr_with_argmin = np.zeros(0)
for cut in cuts:
    vector_validity[int(np.searchsorted(arr, cut[0])) : int(np.searchsorted(arr, cut[1]))] = 1
    print(f"np.searchsorted start: {np.searchsorted(arr, cut[0])}")
    print(f"np.argmin start: {np.argmin(abs(arr - cut[0]))}")
    print(f"np.searchsorted end: {np.searchsorted(arr, cut[1])}")
    print(f"np.argmin end: {np.argmin(abs(arr - cut[1]))}")
    
    new_arr_with_argmin = np.concatenate((new_arr_with_argmin, arr[np.argmin(abs(arr - cut[0])) : np.argmin(abs(arr - cut[1]))]))
new_arr_with_searchsorted = arr[vector_validity == 1]

Here is the printed output:


>     np.searchsorted start: 48655
>     np.argmin start: 48655
>     np.searchsorted end: 137546
>     np.argmin end: 137546
>     np.searchsorted start: 141965
>     np.argmin start: 141964
>     np.searchsorted end: 220445
>     np.argmin end: 220444
>     np.searchsorted start: 223645
>     np.argmin start: 223645
>     np.searchsorted end: 231845
>     np.argmin end: 231844
>     np.searchsorted start: 236748
>     np.argmin start: 236747
>     np.searchsorted end: 323945
>     np.argmin end: 323944
>     np.searchsorted start: 328645
>     np.argmin start: 328644
>     np.searchsorted end: 406546
>     np.argmin end: 406545
>     np.searchsorted start: 409743
>     np.argmin start: 409743
>     np.searchsorted end: 418055
>     np.argmin end: 418055
>     np.searchsorted start: 423055
>     np.argmin start: 423054
>     np.searchsorted end: 510042
>     np.argmin end: 510041
>     np.searchsorted start: 514045
>     np.argmin start: 514044
>     np.searchsorted end: 602140
>     np.argmin end: 602140
>     np.searchsorted start: 606745
>     np.argmin start: 606745
>     np.searchsorted end: 693945
>     np.argmin end: 693944

So starting with the second interval, the two methods sometimes return different indexes. What explains this result?

CodePudding user response:

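The argmin method finds the index of the closest value, which is not what searchsorted does. searchsorted returns the insertion index that keeps the array sorted: with the default side='left', that is the index of the first element greater than or equal to the value. When the closest element is smaller than the value, the searchsorted index is therefore one more than the argmin index, which is the off-by-one difference in your output.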

Here's a simple example:

In [130]: a = np.array([1, 2])

For inputs such as v=1.05 and v=1.95 (both between 1 and 2), the position returned by searchsorted(a, v) is 1:

In [131]: np.searchsorted(a, [1.05, 1.95])
Out[131]: array([1, 1])

Your method based on argmin does not give the same result for input values that are closer to 1 than 2:

In [137]: np.argmin(abs(a - 1.05))
Out[137]: 0

In [138]: np.argmin(abs(a - 1.5))
Out[138]: 0

In [139]: np.argmin(abs(a - 1.51))
Out[139]: 1

In [140]: np.argmin(abs(a - 1.95))
Out[140]: 1
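
If what you actually want is the nearest index, you can still use searchsorted and then compare the two neighbours of the insertion point. The following is only a minimal sketch: the helper name nearest_index is hypothetical (it is not a NumPy function), and it assumes the array is sorted, strictly increasing, and has at least two elements. Ties go to the lower index, which matches what argmin does.

```python
import numpy as np

def nearest_index(sorted_arr, values):
    """Hypothetical helper: index of the element closest to each value.

    Assumes sorted_arr is strictly increasing with at least two elements.
    """
    sorted_arr = np.asarray(sorted_arr)
    values = np.asarray(values)
    # Insertion point: first index i with sorted_arr[i] >= value.
    right = np.searchsorted(sorted_arr, values)
    # Clip so that both neighbours (right - 1 and right) are valid indexes.
    right = np.clip(right, 1, len(sorted_arr) - 1)
    left = right - 1
    # Choose the left neighbour when it is at least as close as the right one.
    use_left = (values - sorted_arr[left]) <= (sorted_arr[right] - values)
    return np.where(use_left, left, right)

a = np.array([1, 2])
print(nearest_index(a, [1.05, 1.5, 1.51, 1.95]))  # prints [0 0 1 1]
```

This gives the same indexes as the argmin calls above, but it uses a binary search instead of computing abs(arr - v) over the whole array for every cut, which matters for a long array of timestamps.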