Home > Blockchain >  Efficiently iterate over nested lists to find sum
Efficiently iterate over nested lists to find sum

Time:10-18

I have an array of arrays and want to check if the sum equals 40. The problem is that the array has around 270,000,000 elements and doing in sequentially is out of the picture. The problem that I am having is finding the sums in a reasonable amount of time. I have ran this program overnight and it is still running in the morning. How can I make this program more efficient and run decently fast?

Here is my code so far:

import numpy as np


def cartesianProduct(arrays):
    la = arrays.shape[0]
    arr = np.empty([la]   [a.shape[0] for a in arrays], dtype="int32")
    for i, a in enumerate(np.ix_(*arrays)):
        arr[i, ...] = a
    return arr.reshape(la, -1).T


rows = np.array(
    [
        [2, 15, 23, 19, 3, 2, 3, 27, 20, 11, 27, 10, 19, 10, 13, 10],
        [22, 9, 5, 10, 5, 1, 24, 2, 10, 9, 7, 3, 12, 24, 10, 9],
        [16, 0, 17, 0, 2, 0, 2, 0, 10, 0, 15, 0, 6, 0, 9, 0],
        [11, 27, 14, 5, 5, 7, 8, 24, 8, 3, 6, 15, 22, 6, 1, 1],
        [10, 0, 2, 0, 22, 0, 2, 0, 17, 0, 15, 0, 14, 0, 5, 0],
        [1, 6, 10, 6, 10, 2, 6, 10, 4, 1, 5, 5, 4, 8, 6, 3],
        [6, 0, 13, 0, 3, 0, 3, 0, 6, 0, 10, 0, 10, 0, 10, 0],
    ],
    dtype="int32",
)

product = cartesianProduct(rows)
combos = []

for row in product:
    if sum(row) == 40:
        combos.append(row)

print(combos)

CodePudding user response:

I believe what you are trying to do is called NP-hard. Look into "dynamic programming" and "subset sum"

Examples:

CodePudding user response:

As suggested in the comments one way to optimize this is to check if the sum of a sub array already exceeds your threshold (40 in this case). and as another optimization to this you can even sort the arrays incrementally from largest to smallest.

Check heapq.nlargest() for incremental partial sorting.

  • Related