How to convert the following processing using numpy

I am trying to improve a part of my code that slows down the whole script significantly, to the point of making it infeasible. The piece of code in question is:

for vectors1 in EC1:
    for vectors2 in EC2:
        r = np.add(vectors1, vectors2)
        for vectors3 in CDC:
            result = np.add(r, vectors3).tolist()
            if result not in states:  # This is what makes it very slow
                states.append(result)

EC1, EC2 and CDC are lists whose elements are themselves lists of lists. As an example, one iteration gives:

vectors1: [[2, 0, 0], [0, 0, 0], [0, 0, 0], [2, 0, 0], [0, 0, 0], [0, 0, 0], [2, 0, 0], [2, 0, 0], [0, 0, 0]]

vectors2: [[0, 0, 0], [2, 0, 0], [0, 0, 0], [0, 0, 0], [2, 0, 0], [2, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0]]

vectors3: [[0, 0, 0], [0, 0, 0], [2, 1, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0], [2, 1, 0], [2, 1, 0]]

result:   [[2, 0, 0], [2, 0, 0], [2, 1, 0], [2, 0, 0], [2, 0, 0], [2, 0, 0], [2, 0, 0], [4, 1, 0], [2, 1, 0]]

Notice that vectors1, vectors2 and vectors3 are each one element of EC1, EC2 and CDC respectively, and that 'result' is the element-wise sum of vectors1, vectors2 and vectors3. The input vectors therefore cannot be altered or sorted in any way, since that would change the expected value of 'result'.

The first two loops sum each item in EC1 with each item in EC2, and the third loop adds each item in CDC to that intermediate sum ('r'). I use numpy.add() for both additions and finally convert 'result' back to a list. So basically I am handling lists of lists as the elements of EC1, EC2 and CDC.

The problem is that I must deal with hundreds of thousands (close to 1M) of results, and checking whether each result already exists in the states list slows things down drastically, especially since the states list grows as more results are processed.

I've tried to keep inside the numpy world by managing everything as numpy arrays. First declaring states as:

states = np.empty([9, 3], int)

Then concatenating the result numpy array to the states numpy array, after first checking whether it already exists in states:

for vectors1 in EC1:
    for vectors2 in EC2:
        r = np.add(vectors1, vectors2)
        for vectors3 in CDC:
            result = np.add(r, vectors3)
            if not np.isin(states, result).any():
                np.concatenate(states, result, axis=0)

But I am clearly doing something wrong, because result is not being concatenated to states. I have also tried, without success:

np.append(states, result, axis=0)

Could this be parallelized in some way?

CodePudding user response：

You can do the sums solely in numpy by using broadcasting

res = ((EC1[:, None, :] + EC2).reshape(-1, 1, 3) + CDC).reshape(-1, 3)

given that EC1, EC2 and CDC are arrays.

Afterwards you can filter out the duplicates with

np.unique(res, axis=0)
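The broadcasting sum and the duplicate filter can be tried end to end on the example rows from the question. This is only a minimal sketch: it assumes EC1, EC2 and CDC are 2-D integer arrays of shape (9, 3), and the names pair_sums and states are chosen here for illustration.

```python
import numpy as np

# Example rows from the question, as (9, 3) integer arrays.
EC1 = np.array([[2, 0, 0], [0, 0, 0], [0, 0, 0], [2, 0, 0], [0, 0, 0],
                [0, 0, 0], [2, 0, 0], [2, 0, 0], [0, 0, 0]])
EC2 = np.array([[0, 0, 0], [2, 0, 0], [0, 0, 0], [0, 0, 0], [2, 0, 0],
                [2, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0]])
CDC = np.array([[0, 0, 0], [0, 0, 0], [2, 1, 0], [0, 0, 0], [0, 0, 0],
                [0, 0, 0], [0, 0, 0], [2, 1, 0], [2, 1, 0]])

# Every EC1 row plus every EC2 row: shape (9, 9, 3).
pair_sums = EC1[:, None, :] + EC2

# Every pair sum plus every CDC row, flattened to one sum per row: (729, 3).
res = (pair_sums.reshape(-1, 1, 3) + CDC).reshape(-1, 3)

# Drop duplicate rows; np.unique returns them in sorted order.
states = np.unique(res, axis=0)

print(res.shape)     # (729, 3)
print(states.shape)  # (6, 3)
```

Note that np.unique sorts the rows, so the order differs from the first-seen order that the original loop with states.append() would produce.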

But like Lucas, I would strongly advise you to filter the arrays beforehand. For your example arrays that would shrink the number of rows in res from 729 to 8.
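A sketch of that pre-filtering, again assuming (9, 3) integer arrays built from the example in the question. The names ending in _u are illustrative only.

```python
import numpy as np

EC1 = np.array([[2, 0, 0], [0, 0, 0], [0, 0, 0], [2, 0, 0], [0, 0, 0],
                [0, 0, 0], [2, 0, 0], [2, 0, 0], [0, 0, 0]])
EC2 = np.array([[0, 0, 0], [2, 0, 0], [0, 0, 0], [0, 0, 0], [2, 0, 0],
                [2, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0]])
CDC = np.array([[0, 0, 0], [0, 0, 0], [2, 1, 0], [0, 0, 0], [0, 0, 0],
                [0, 0, 0], [0, 0, 0], [2, 1, 0], [2, 1, 0]])

# Remove duplicate rows from each input before combining them.
EC1_u = np.unique(EC1, axis=0)  # 2 distinct rows
EC2_u = np.unique(EC2, axis=0)  # 2 distinct rows
CDC_u = np.unique(CDC, axis=0)  # 2 distinct rows

# Same broadcasting sum as before, now over 2 * 2 * 2 = 8 combinations.
res = ((EC1_u[:, None, :] + EC2_u).reshape(-1, 1, 3) + CDC_u).reshape(-1, 3)

# Different combinations can still give the same sum, so filter once more.
states = np.unique(res, axis=0)

print(res.shape)     # (8, 3)
print(states.shape)  # (6, 3)
```

The final np.unique is still needed because, for instance, [2, 0, 0] + [0, 0, 0] and [0, 0, 0] + [2, 0, 0] produce the same row.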

CodePudding user response：

I'm not sure how large the data are that you are working with but this may speed things up somewhat:

import numpy as np

EC1 = [[2, 0, 0], [0, 0, 0], [0, 0, 0], [2, 0, 0], [0, 0, 0], [0, 0, 0], [2, 0, 0], [2, 0, 0], [0, 0, 0]]
EC2 = [[0, 0, 0], [2, 0, 0], [0, 0, 0], [0, 0, 0], [2, 0, 0], [2, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0]]
CDC = [[0, 0, 0], [0, 0, 0], [2, 1, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0], [2, 1, 0], [2, 1, 0]]
EC1.sort()
EC2.sort()
CDC.sort()
unique_triples = dict()

for v1 in EC1:
    for v2 in EC2:
        for v3 in CDC:
            if str(v1) + str(v2) + str(v3) not in unique_triples:  # lists are not hashable, but strings are
                unique_triples[str(v1) + str(v2) + str(v3)] = list(np.add(np.add(v1, v2), v3))

The basic idea is to skip duplicate (EC1, EC2, CDC) triples and do the additions only on unique triples. The lists are sorted first so that they are in lexicographic order.

A dictionary has O(1) average-case lookups, whereas `result not in states` on a list has to scan the whole list, so these lookups are (maybe) faster.

Whether this is actually faster may depend on how large the data are and on how many unique triples they contain.

The 3-vector sums are the values of the dictionary, e.g. list(unique_triples.values()) for me gives:

>>> list(unique_triples.values())
[[0, 0, 0], [2, 1, 0], [2, 0, 0], [4, 1, 0], [2, 0, 0], [4, 1, 0], [4, 0, 0], [6, 1, 0]]
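Because the dictionary is keyed on the input triples, its values can still repeat: [2, 0, 0] and [4, 1, 0] each appear twice in the output. As a variation, and not part of the answer's code, one could key a set on the result itself. The sketch below assumes all elements have the same shape; the function name unique_sums and the variable seen are hypothetical.

```python
import numpy as np

def unique_sums(EC1, EC2, CDC):
    """Collect distinct sums in first-seen order, using a set for lookups."""
    seen = set()   # hashable keys of results already stored
    states = []    # distinct results, in the order they first appear
    for v1 in EC1:
        for v2 in EC2:
            r = np.add(v1, v2)
            for v3 in CDC:
                result = np.add(r, v3)
                # Flatten to a tuple of ints so the result can be hashed.
                key = tuple(result.ravel().tolist())
                if key not in seen:
                    seen.add(key)
                    states.append(result.tolist())
    return states

EC1 = [[2, 0, 0], [0, 0, 0], [0, 0, 0], [2, 0, 0], [0, 0, 0], [0, 0, 0], [2, 0, 0], [2, 0, 0], [0, 0, 0]]
EC2 = [[0, 0, 0], [2, 0, 0], [0, 0, 0], [0, 0, 0], [2, 0, 0], [2, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0]]
CDC = [[0, 0, 0], [0, 0, 0], [2, 1, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0], [2, 1, 0], [2, 1, 0]]

states = unique_sums(EC1, EC2, CDC)
print(states)
```

This keeps the structure of the original loop and does not sort the inputs; only the membership test changes from a list scan to a set lookup. The same function should also work when each element is a list of lists, as in the question, since the key is built from the flattened result.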

I did not remove the duplicates in the original lists of lists here. If the application you are looking at allows, it is also likely beneficial to remove these duplicates in EC1, EC2, and CDC before iterating over the values.
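One possible way to remove those duplicates from plain lists of lists while keeping their original order is sketched below. The helper dedupe_rows is a hypothetical name, and the approach assumes each entry is a flat list of hashable values such as ints.

```python
def dedupe_rows(rows):
    """Return rows with duplicates removed, keeping first-seen order."""
    # Tuples are hashable; dict.fromkeys preserves insertion order.
    return [list(t) for t in dict.fromkeys(tuple(r) for r in rows)]

EC1 = [[2, 0, 0], [0, 0, 0], [0, 0, 0], [2, 0, 0], [0, 0, 0], [0, 0, 0], [2, 0, 0], [2, 0, 0], [0, 0, 0]]
EC2 = [[0, 0, 0], [2, 0, 0], [0, 0, 0], [0, 0, 0], [2, 0, 0], [2, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0]]
CDC = [[0, 0, 0], [0, 0, 0], [2, 1, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0], [0, 0, 0], [2, 1, 0], [2, 1, 0]]

EC1_u = dedupe_rows(EC1)
EC2_u = dedupe_rows(EC2)
CDC_u = dedupe_rows(CDC)

# Number of (v1, v2, v3) combinations left to iterate over.
print(len(EC1_u) * len(EC2_u) * len(CDC_u))  # 8
```

For the example data this reduces the triple loop from 9 * 9 * 9 = 729 iterations to 8.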