Python/NumPy: Split non-consecutive values into discrete subset arrays

Time:11-22

How can I slice an array such as this one into n-many subsets, where each subset consists of consecutive values?

arr = np.array((0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 39, 40,
       41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 66, 67, 68, 69, 70, 71))
# tells me where they are consecutive
np.where(np.diff(arr) == 1)[0]
# where the break points are
cut_points = np.where(np.diff(arr) != 1)[0] + 1
# won't generalize well to n-many subsets
arr[:cut_points[0]]
arr[cut_points[0]:cut_points[1]]
arr[cut_points[1]:]

CodePudding user response:

You can use np.split and pass in cut_points as the second argument. It splits the array at each of those indices, however many there are.

For example:

split_arr = np.split(arr, cut_points)

# split_arr looks like:
# [array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14]),
# array([39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55]),
# array([66, 67, 68, 69, 70, 71])]

Full solution:

import numpy as np
arr = np.array((0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 39, 40,
       41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 66, 67, 68, 69, 70, 71))
cut_points = np.where(np.diff(arr) != 1)[0] + 1
split_arr = np.split(arr, cut_points)
split_arr
# outputs:
[array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14]),
 array([39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55]),
 array([66, 67, 68, 69, 70, 71])]
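
The same idea can be wrapped in a small helper. This is only a sketch: the function name split_consecutive and its step parameter are made up for illustration and are not part of NumPy. It also returns an empty list for empty input, because np.split would otherwise return a list holding one empty array.

```python
import numpy as np

def split_consecutive(values, step=1):
    """Split a 1-D sequence into runs whose neighbours differ by `step`.

    Hypothetical helper name; `step` is an illustrative parameter.
    """
    values = np.asarray(values)
    if values.size == 0:
        # np.split would give [array([])] here; an empty list is easier to use
        return []
    # index of the first element of every new run
    cut_points = np.where(np.diff(values) != step)[0] + 1
    return np.split(values, cut_points)

runs = split_consecutive([0, 1, 2, 7, 8, 20])
print([run.tolist() for run in runs])  # [[0, 1, 2], [7, 8], [20]]
```

A single-element input comes back as one run, since np.diff finds no break points in it.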

CodePudding user response:

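As an alternative, here is a way to do it without pandas or NumPy.

If you don't care about the order of the input/output, you can start at the end and do something like the following. While the reversed values keep descending by 1, the sum k + i stays constant, so a change in that sum marks the start of a new subset: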

l = (0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 66, 67, 68, 69, 70, 71)
i = 0
current_index = -1
prev_value = None
result = []
for k in l[::-1]:
    current_value = k + i  # constant while the reversed values keep descending by 1
    if prev_value != current_value:
        prev_value = current_value
        current_index += 1
        result.append([])
    result[current_index].append(k)
    i += 1
print(result)

Then result will contain:

[
    [71, 70, 69, 68, 67, 66],
    [55, 54, 53, 52, 51, 50, 49, 48, 47, 46, 45, 44, 43, 42, 41, 40, 39],
    [14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
]
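
If the original order does matter, a similar trick works going forwards. The sketch below swaps the manual loop for itertools.groupby from the standard library, keyed on value minus index, which stays constant inside an ascending run of consecutive integers. The function name consecutive_runs is made up for illustration.

```python
from itertools import groupby

def consecutive_runs(values):
    """Group integers into lists of consecutive values, keeping input order.

    Hypothetical helper name, used here only for illustration.
    """
    runs = []
    # value - index is constant within an ascending run of consecutive integers
    for _, group in groupby(enumerate(values), key=lambda pair: pair[1] - pair[0]):
        runs.append([value for _, value in group])
    return runs

print(consecutive_runs((0, 1, 2, 7, 8, 20)))  # [[0, 1, 2], [7, 8], [20]]
```

Like the loop above, this assumes the values increase by exactly 1 within a run; repeated values end up in separate groups.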