My concern involves huge arrays with shapes like (14!, 14), but I'll ask the question using a much smaller array.
Consider an array p holding the 10! permutations of a = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]. We can create a permutation array of this shape (i.e. (3628800, 10)) in a variety of ways, say:

import itertools
import numpy as np

p = np.array(list(itertools.permutations(range(10))))
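For a sense of scale, the full array is already sizeable at n = 10, which is why building the (14!, 14) version in one go is out of the question (a quick check, assuming numpy's default 64-bit integer dtype):

print(p.shape)   # (3628800, 10)
print(p.nbytes)  # 290304000 bytes, roughly 277 MiB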
QUESTION: I'd like to know if there is any way I could produce, say, array p1 holding the first 100000 permutations, then array p2 holding the next 100000 permutations, and so on, up to array p37 holding the last 28800 permutations.
I'm not talking about creating the full set of permutations, then subdividing it. What I'd like to know is whether I can actually generate the permutation rows in 'clumps' of suitable size. The actual order of rows in each 'clump' isn't an issue, as long as the full set of 'clumps' holds all permutations without any overlap.
As mentioned earlier, my actual concern is to find a way, in principle, to handle much larger arrays of permutations. I'll worry about the exact size of the 'clumps' later.
CodePudding user response:
Use itertools.islice in the batched recipe from the itertools documentation:
from itertools import islice, permutations

def batched(iterable, n):
    "Batch data into tuples of length n. The last batch may be shorter."
    # batched('ABCDEFG', 3) --> ABC DEF G
    if n < 1:
        raise ValueError('n must be at least one')
    it = iter(iterable)
    while (batch := tuple(islice(it, n))):
        yield batch
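On Python 3.12 or newer you don't need the recipe at all, since this helper was promoted into the standard library:

from itertools import batched  # built in since Python 3.12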
import numpy as np

perm = permutations(range(10))
arrays = [np.array(x) for x in batched(perm, 100000)]
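Note that collecting every clump into a list like this still keeps all 3628800 rows in memory at once, which defeats the purpose for the (14!, 14) case; the streaming loop below avoids that by holding only one clump at a time.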
If you want to iterate by chunk:
perm = permutations(range(10))
for x in batched(perm, 100000):
    a = np.array(x)
    print(a)
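As a sanity check (a small sketch, assuming the batched helper defined above is in scope), you can confirm the clumps together cover all 10! permutations: permutations yields each permutation exactly once and islice only consumes the shared iterator, so the clumps are disjoint by construction.

import math

perm = permutations(range(10))
total = sum(len(x) for x in batched(perm, 100000))
assert total == math.factorial(10)  # 3628800 rows across all 37 clumps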