Split a numpy array based on a given condition

Time:05-19

How can I split a sorted NumPy array, e.g. arr = np.array([5, 6, 28, 29, 32, 33, 87, 88, 95]), into sub-arrays such that the following two conditions are always met?

(1) The difference between the first and the last elements of a sub-array is less than 10.

(2) The difference between the last element of a sub-array and the first element of the next sub-array is more than 20.

For the arr above, the expected list is split_arr=[([5,6]),([28,29,32,33]),([87,88,95])].

CodePudding user response:

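As I said, your two conditions conflict when arr looks like [5, 6, 16, 28, 29, 32, 33, 87, 88, 95]. There is no valid place for 16: it cannot join [5, 6] (16 - 5 = 11 is not less than 10), and it cannot start a new sub-array either (16 - 6 = 10 is not more than 20). The following code therefore enforces only condition 1 and ignores condition 2: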

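arr = [5, 6, 28, 29, 32, 33, 87, 88, 95]

results = []
sub_arr = []
idx = 0

while idx < len(arr):
    # Start a new sub-array, or extend the current one while the span
    # from its first element stays below 10 (condition 1).
    if not sub_arr or arr[idx] - sub_arr[0] < 10:
        sub_arr.append(arr[idx])
    else:
        # The span would reach 10 or more: close the current sub-array
        # and start a new one with this element.
        results.append(sub_arr)
        sub_arr = [arr[idx]]
    idx += 1

# Append the final sub-array.
if sub_arr:
    results.append(sub_arr)

print(results)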

Output:

[[5, 6], [28, 29, 32, 33], [87, 88, 95]]
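
To make the conflict described above concrete, here is a minimal sketch of a checker that reports whether a given split satisfies each condition. The name check_split and the parameters max_span and min_gap are hypothetical (they do not appear in the question), and every group is assumed to be sorted and non-empty.

```python
def check_split(groups, max_span=10, min_gap=20):
    """Return (cond1_ok, cond2_ok) for a list of sorted, non-empty groups."""
    # Condition 1: last - first of every group is below max_span.
    cond1_ok = all(g[-1] - g[0] < max_span for g in groups)
    # Condition 2: the gap between consecutive groups exceeds min_gap.
    cond2_ok = all(nxt[0] - cur[-1] > min_gap
                   for cur, nxt in zip(groups, groups[1:]))
    return cond1_ok, cond2_ok


# The split from the question satisfies both conditions.
print(check_split([[5, 6], [28, 29, 32, 33], [87, 88, 95]]))        # (True, True)
# With 16 in the input, the loop above yields a [16] group that breaks condition 2.
print(check_split([[5, 6], [16], [28, 29, 32, 33], [87, 88, 95]]))  # (True, False)
```

Running a check like this on the output is one way to detect inputs for which no split can satisfy both conditions.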

CodePudding user response:

This can be done with NumPy alone by applying condition 2 first and then checking condition 1:

import numpy as np

arr = np.array([5, 6, 28, 29, 32, 33, 87, 88, 95])

diff1 = np.diff(arr, prepend=arr[0] - 21, append=arr[-1] + 21)  # [21  1 22  1  3  1 54  1  7 21]
ind = np.where(diff1 > 20)[0]             # [0 2 6 9]
start = ind[:-1]                          # [0 2 6]
end = ind[1:] - 1                         # [1 5 8]
cond1 = (arr[end] - arr[start]) < 10      # [True True True]
result = np.split(arr, start[cond1][1:])  # [array([5, 6]), array([28, 29, 32, 33]), array([87, 88, 95])]
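
The same idea can be wrapped in a small function that splits wherever condition 2 holds and then reports condition 1 for each group separately. This is only a sketch under stated assumptions: split_on_gaps, max_span and min_gap are hypothetical names, and the input is assumed to be sorted and non-empty.

```python
import numpy as np


def split_on_gaps(arr, max_span=10, min_gap=20):
    """Split a sorted, non-empty array at gaps > min_gap; flag groups with span < max_span."""
    arr = np.asarray(arr)
    # A new group starts right after every gap larger than min_gap (condition 2).
    cuts = np.flatnonzero(np.diff(arr) > min_gap) + 1
    groups = np.split(arr, cuts)
    # Condition 1, checked per group: last - first must be below max_span.
    spans_ok = [bool(g[-1] - g[0] < max_span) for g in groups]
    return groups, spans_ok


groups, spans_ok = split_on_gaps([5, 6, 28, 29, 32, 33, 87, 88, 95])
print([g.tolist() for g in groups])  # [[5, 6], [28, 29, 32, 33], [87, 88, 95]]
print(spans_ok)                      # [True, True, True]
```

Returning the flags instead of filtering makes the conflicting case visible: for [5, 6, 16, 28, 29, 32, 33, 87, 88, 95] the first group is [5, 6, 16, 28, 29, 32, 33] and its flag is False.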