Get N Smallest values from numpy array with array size potentially less than N

Time:01-25

I am trying to use numpy.argpartition to get the n smallest values from an array. However, I cannot guarantee that there will be at least n values in the array. If there are fewer than n values, I just need the entire array.

Currently I am handling this by checking the array size, but I feel like I'm missing a native numpy method that would avoid this branching check.

# <= rather than <: argpartition requires kth < arr.size, so size == N must also take this branch
if np.size(arr) <= N:
    return arr
else:
    return arr[np.argpartition(arr, N)][:N]

Minimal reproducible example:

import numpy as np

#Find the 4 smallest values in the array
#Arrays can be arbitrarily sized, as it's the result of finding all elements in a larger array
# that meet a threshold
small_arr = np.array([3,1,4])
large_arr = np.array([3,1,4,5,0,2])

#For large_arr, I can use np.argpartition just fine:
large_idx = np.argpartition(large_arr, 4)
#large_idx now contains array([4, 5, 1, 0, 2, 3])

#small_arr results in an indexing error doing the same thing:
small_idx = np.argpartition(small_arr, 4)
#ValueError: kth(=4) out of bounds (3)

I've looked through the numpy docs for truncation, max length, and other similar terms, but found nothing that does what I need.

CodePudding user response:

One way (depending on your situation) is to cap the kth argument to argpartition with min when the array is shorter:

return arr[np.argpartition(arr, min(N, arr.size - 1))][:N]

Slicing tolerates stop values larger than the array length, so the [:N] part is safe; it's only argpartition that needs the min() check.
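To make the capped version concrete, here is a minimal sketch wrapped in a function. The name n_smallest_capped is made up for illustration, and the empty-array guard is my own addition (with an empty array, arr.size - 1 would be -1, which is not a valid kth); treat it as an assumption rather than part of the one-liner above.

```python
import numpy as np

def n_smallest_capped(arr, n):
    """Return up to n smallest values of arr, in no particular order.

    Hypothetical helper name; the empty-array guard is an added assumption.
    """
    arr = np.asarray(arr)
    if arr.size == 0:
        return arr  # nothing to partition
    # kth must be a valid index, so cap it at the last position
    kth = min(n, arr.size - 1)
    # the [:n] slice is safe even when n exceeds arr.size
    return arr[np.argpartition(arr, kth)][:n]

small_arr = np.array([3, 1, 4])
large_arr = np.array([3, 1, 4, 5, 0, 2])
small_result = n_smallest_capped(small_arr, 4)  # all three values, some order
large_result = n_smallest_capped(large_arr, 4)  # the values 0, 1, 2, 3, some order
```

Note that the result order is whatever argpartition leaves behind, so sort it afterwards if you need ordered output.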

This is less efficient than your branch version, since it has to do the argpartition even when you just want the whole array, but it's more concise. So it depends what your priorities are - personally I'd probably keep the branch or use a ternary:

return arr if arr.size <= N else arr[np.argpartition(arr, N)][:N]
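If you only need the values and not their indices, np.partition can replace the argpartition-plus-indexing step. A rough sketch of the branching version as a function follows; the name n_smallest is hypothetical, and the comparison is <= because kth has to be strictly less than the array size.

```python
import numpy as np

def n_smallest(arr, n):
    """Return the n smallest values (unordered), or the whole array if it is short.

    Hypothetical helper name, shown only as a sketch.
    """
    arr = np.asarray(arr)
    if arr.size <= n:
        # too few elements to partition at position n: return everything
        return arr
    # np.partition puts the n smallest values in the first n positions
    return np.partition(arr, n)[:n]

small_result = n_smallest(np.array([3, 1, 4]), 4)
large_result = n_smallest(np.array([3, 1, 4, 5, 0, 2]), 4)
```

One thing to be aware of: in the short case this hands back the input array itself rather than a copy, so call arr.copy() there if the caller might modify the result.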

CodePudding user response:

You can try:

def nsmallest(array, n):
    return array[np.argsort(array)[:n]]

where n is the number of smallest values that you want.
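The argsort version needs no size check because the [:n] slice just stops at the end of the array, and it returns the values in ascending order. It does sort the whole array, though. If you want sorted output without a full sort, one option is to partition first and sort only the n values you keep. This is a sketch under that assumption, and nsmallest_sorted is a made-up name:

```python
import numpy as np

def nsmallest_sorted(array, n):
    """Return up to n smallest values in ascending order.

    Hypothetical helper; partitions first so only n values get sorted.
    """
    array = np.asarray(array)
    if array.size <= n:
        # short array: sorting everything is all that is needed
        return np.sort(array)
    # select the n smallest (unordered), then sort just those
    return np.sort(np.partition(array, n)[:n])

large_arr = np.array([3, 1, 4, 5, 0, 2])
via_partition = nsmallest_sorted(large_arr, 4)
via_argsort = large_arr[np.argsort(large_arr)[:4]]  # same values, same order
```

For small arrays the difference is unlikely to matter, so the plain argsort one-liner may well be the simpler choice.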
