Home > Blockchain >  How to make itertools combinations 'increase' evenly?
How to make itertools combinations 'increase' evenly?

Time:09-29

Consider the following example:

import itertools
import numpy as np

a = np.arange(0,5)
b = np.arange(0,3)
c = np.arange(0,7)

prods = itertools.product(a,b,c)

for p in prods:
    print(p)

This iterate over the products in the following order:

(0, 0, 0)
(0, 0, 1)
(0, 0, 2)
(0, 0, 3)
(0, 0, 4)
(0, 1, 0)

But I would much rather have the products given in order of their sum, e.g.

(0, 0, 0)
(0, 0, 1)
(0, 1, 0)
(1, 0, 0)
(0, 1, 1)
(1, 0, 1)
(1, 1, 0)
(0, 0, 2)

How can I achieve this without storing all combinations in memory?

Note: a b and c are always ranges, but not necessarily with the same maximum. There is also no 2nd-level ordering when the sums of two products are equal, i.e. (0,1,1) is equivalent to (2,0,0).

CodePudding user response:

The easiest way to do this without storing extra products in memory is with recursion. Instead of range(a,b), pass in a list of (a,b) pairs and do the iteration yourself:

def prod_by_sum(range_bounds: List[Tuple[int, int]]):
    """
    Yield from the Cartesian product of input ranges, produced in order of sum.

    >>> range_bounds = [(2, 4), (3, 6), (0, 2)]
    >>> for prod in prod_by_sum(range_bounds):
    ...    print(prod)
    (2, 3, 0)
    (2, 3, 1)
    (2, 4, 0)
    (3, 3, 0)
    (2, 4, 1)
    (2, 5, 0)
    (3, 3, 1)
    (3, 4, 0)
    (2, 5, 1)
    (3, 4, 1)
    (3, 5, 0)
    (3, 5, 1)

    """
    def prod_by_sum_helper(start: int, goal_sum: int):
        low, high = range_bounds[start]
        if start == len(range_bounds) - 1:
            if low <= goal_sum < high:
                yield (goal_sum,)
            return

        for current in range(low, min(high, goal_sum   1)):
            yield from ((current,)   extra
                        for extra in prod_by_sum_helper(start   1, goal_sum - current))

    lowest_sum = sum(lo for lo, hi in range_bounds)
    highest_sum = sum(hi - 1 for lo, hi in range_bounds)

    for goal_sum in range(lowest_sum, highest_sum   1):
        yield from prod_by_sum_helper(0, goal_sum)

which has output for range_bounds = [(0, 5), (0, 3), (0, 7)] starting with:

(0, 0, 0)
(0, 0, 1)
(0, 1, 0)
(1, 0, 0)
(0, 0, 2)
(0, 1, 1)
(0, 2, 0)
(1, 0, 1)
(1, 1, 0)
(2, 0, 0)

You can do this exact process iteratively by modifying a single list and yielding copies of it, but the code either becomes more complicated or less efficient.

You can also trivially modify this to support steps besides 1, however that does work less efficiently with larger and larger steps since the last range might not contain the element needed to produce the current sum. That seems unavoidable, because at that point you'd need to solve a difficult computational problem to efficiently loop over those products by sum.

CodePudding user response:

If the steps are always 1 and avoiding storing all combinations is your top priority, you could do the following (partially using itertools.product):

import itertools
import numpy as np

def weak_compositions(boxes, balls, parent=tuple()):
    """https://stackoverflow.com/a/36748940/4001592"""
    if boxes > 1:
        for i in range(balls   1):
            for x in weak_compositions(boxes - 1, i, parent   (balls - i,)):
                yield x
    else:
        yield parent   (balls,)


def verify_limits(x, minimum, maximum):
    all_max = all(xi <= li for xi, li in zip(x, maximum))
    all_min = all(xi >= li for xi, li in zip(x, minimum))
    return all_max and all_min


def iterate_in_sum(ranges):
    prods = itertools.product(*ranges)

    # find number of different sums
    unique_total_sums = sorted(set(sum(p) for p in prods))

    # find the minimum limits
    minimum_allowed = [min(r) for r in ranges]

    # find the maximum limits
    maximum_allowed = [max(r) for r in ranges]

    for total_sum in unique_total_sums:
        # decompose each sum into its summands
        for x in weak_compositions(len(ranges), total_sum):

            # if the decomposition meets the limits
            if verify_limits(x, minimum_allowed, maximum_allowed):
                yield x


a = np.arange(0, 5)
b = np.arange(0, 3)
c = np.arange(0, 7)

for s in iterate_in_sum([a, b, c]):
    print(s, sum(s))

Output (partial)

(0, 0, 0) 0
(1, 0, 0) 1
(0, 1, 0) 1
(0, 0, 1) 1
(2, 0, 0) 2
(1, 1, 0) 2
(1, 0, 1) 2
(0, 2, 0) 2
(0, 1, 1) 2
(0, 0, 2) 2
(3, 0, 0) 3
(2, 1, 0) 3
(2, 0, 1) 3
(1, 2, 0) 3
(1, 1, 1) 3
(1, 0, 2) 3
(0, 2, 1) 3
(0, 1, 2) 3

The core of the solution is the weak_compositions function that will decompose a number into it's summands (something like integer partition). More solutions to the above problem of composition of n into k parts can be found here.

Note:

The solution can be extended to non uniform steps with additional complexity cost.

  • Related