Suppose we have an array arr and want to divide it into pieces while preserving the order of the elements.
This is easily done with np.array_split:
import numpy as np
arr = np.array([0,1,2,3,4,5,6,7,8])
pieces = 3
np.array_split(arr,pieces)
>>> [array([0, 1, 2]), array([3, 4, 5]), array([6, 7, 8])]
If arr.size % pieces != 0, the output of np.array_split will be uneven:
arr = np.array([0,1,2,3,4,5,6,7])
pieces = 3
np.array_split(arr,pieces)
>>> [array([0, 1, 2]), array([3, 4, 5]), array([6, 7])]
What is the best way to add randomization to this procedure so that each of the following outputs is returned with equal probability?
>>> [array([0, 1]), array([2, 3, 4]), array([5, 6, 7])]
>>> [array([0, 1, 2]), array([3, 4]), array([5, 6, 7])]
>>> [array([0, 1, 2]), array([3, 4, 5]), array([6, 7])]
I am interested in a general solution that also works for other combinations of array size and number of pieces, for example:
arr = np.array([0,1,2,3,4,5,6,7,8,9])
pieces = 6
CodePudding user response:
def random_arr_split(arr, n):
    # NumPy doc: For an array of length l that should be split into n sections,
    # it returns l % n sub-arrays of size l//n + 1 and the rest of size l//n
    piece_lens = [arr.size // n + 1] * (arr.size % n) + [arr.size // n] * (n - arr.size % n)
    # shuffle the piece lengths so the larger pieces land in random positions
    piece_lens_shuffled = np.random.permutation(piece_lens)
    # drop the last cumulative sum, which equals arr.size (the end of the array);
    # otherwise array_split would return an extra empty array at the end
    split_indices = np.cumsum(piece_lens_shuffled)[:-1]
    return np.array_split(arr, split_indices)
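A quick way to sanity-check this approach on the second example from the question (10 elements, 6 pieces) is sketched below. It is a minimal variant of the function above, not a different method: the rng parameter is a hypothetical addition, assumed here only so that a seeded np.random.default_rng generator can make runs reproducible.

```python
import numpy as np

def random_arr_split(arr, n, rng=None):
    # rng is a hypothetical extra parameter (not in the original answer),
    # used so the shuffle can be seeded for reproducible runs
    rng = np.random.default_rng() if rng is None else rng
    # q = base piece size, r = number of pieces that get one extra element
    q, r = divmod(arr.size, n)
    piece_lens = np.array([q + 1] * r + [q] * (n - r))
    # shuffle the lengths, then turn them into split positions
    split_indices = np.cumsum(rng.permutation(piece_lens))[:-1]
    return np.array_split(arr, split_indices)

arr = np.arange(10)
pieces = random_arr_split(arr, 6, rng=np.random.default_rng(0))

# the order of the sizes is random, but the multiset of sizes is fixed
sizes = sorted(p.size for p in pieces)
print(sizes)  # [1, 1, 2, 2, 2, 2]
```

Because the permutation is uniform, every distinct arrangement of the piece sizes should come up with the same probability, and concatenating the pieces always gives back the original array in order.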