What is the most efficient way to deal with a loop on NumPy arrays?

Time:06-03

The question is simple: here is my current algorithm. It is terribly slow because of the loops over the arrays. Is there a way to rewrite it so that it avoids the loops and takes advantage of NumPy array operations?

import numpy as np

def loopingFunction(listOfVector1, listOfVector2):
    resultArray = []

    for vector1 in listOfVector1:
        result = 0

        for vector2 in listOfVector2:
            result += np.dot(vector1, vector2) * vector2[2]

        resultArray.append(result)

    return np.array(resultArray)

listOfVector1x = np.linspace(0,0.33,1000)
listOfVector1y = np.linspace(0.33,0.66,1000)
listOfVector1z = np.linspace(0.66,1,1000)

listOfVector1 = np.column_stack((listOfVector1x, listOfVector1y, listOfVector1z))

listOfVector2x = np.linspace(0.33,0.66,1000)
listOfVector2y = np.linspace(0.66,1,1000)
listOfVector2z = np.linspace(0, 0.33, 1000)

listOfVector2 = np.column_stack((listOfVector2x, listOfVector2y, listOfVector2z))

result = loopingFunction(listOfVector1, listOfVector2)

I need to handle really big arrays, with far more than 1000 vectors in each, so any advice is welcome.

CodePudding user response:

You can at least remove the two for loops, which saves a lot of time, by using a matrix computation directly:

import time

import numpy as np

def loopingFunction(listOfVector1, listOfVector2):
    resultArray = []

    for vector1 in listOfVector1:
        result = 0

        for vector2 in listOfVector2:
            result += np.dot(vector1, vector2) * vector2[2]

        resultArray.append(result)

    return np.array(resultArray)

def loopingFunction2(listOfVector1, listOfVector2):
    resultArray = np.sum(np.dot(listOfVector1, listOfVector2.T) * listOfVector2[:,2], axis=1)

    return resultArray

listOfVector1x = np.linspace(0,0.33,1000)
listOfVector1y = np.linspace(0.33,0.66,1000)
listOfVector1z = np.linspace(0.66,1,1000)

listOfVector1 = np.column_stack((listOfVector1x, listOfVector1y, listOfVector1z))

listOfVector2x = np.linspace(0.33,0.66,1000)
listOfVector2y = np.linspace(0.66,1,1000)
listOfVector2z = np.linspace(0, 0.33, 1000)

listOfVector2 = np.column_stack((listOfVector2x, listOfVector2y, listOfVector2z))
t0 = time.time()
result = loopingFunction(listOfVector1, listOfVector2)
print('time old version',time.time() - t0)
t0 = time.time()
result2 = loopingFunction2(listOfVector1, listOfVector2)
print('time matrix computation version',time.time() - t0)
print('Are results are the same',np.allclose(result,result2))

Which gives

time old version 1.174513578414917
time matrix computation version 0.011968612670898438
Are results are the same True

Basically, the fewer Python-level loops, the better.
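One caveat for really big inputs: np.dot(listOfVector1, listOfVector2.T) builds an intermediate array with one entry per pair of vectors, so its memory use grows with the product of the two lengths. Since the inner sum is linear, it can probably be regrouped so that listOfVector2 is first collapsed into a single weighted vector. The following is a minimal sketch under the assumption that both inputs are 2-D float arrays with three columns; the names pairwise_version and regrouped_version are hypothetical and not part of the original answer.

```python
import numpy as np

def pairwise_version(a, b):
    # Builds the full (len(a), len(b)) matrix of dot products first.
    return np.sum(np.dot(a, b.T) * b[:, 2], axis=1)

def regrouped_version(a, b):
    # Collapse b into one weighted vector: sum over k of b[k] * b[k, 2].
    weighted = b.T @ b[:, 2]      # shape (3,)
    # A single matrix-vector product; no (len(a), len(b)) intermediate.
    return a @ weighted           # shape (len(a),)

# Illustrative random data (assumption: sizes chosen only for the demo).
rng = np.random.default_rng(0)
a = rng.random((500, 3))
b = rng.random((400, 3))

same = np.allclose(pairwise_version(a, b), regrouped_version(a, b))
```

With this regrouping the work scales with the sum of the two lengths rather than their product, which should matter more and more as the arrays grow.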

CodePudding user response:

The obligatory np.einsum benchmark

r2 = np.einsum('ij, kj, k->i', listOfVector1, listOfVector2, listOfVector2[:,2], optimize=['einsum_path', (1, 2), (0, 1)])
#%timeit result: 10000 loops, best of 5: 116 µs per loop

np.testing.assert_allclose(result, r2)
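The optimize argument above fixes the contraction order by hand: operands 1 and 2 are combined first, then the result is combined with operand 0. As a rough sketch, NumPy can also be asked to suggest an order through np.einsum_path, and the returned path can be passed back to np.einsum. The array sizes and the names v1, v2, r_auto and r_manual are assumptions made for this illustration.

```python
import numpy as np

# Illustrative random data with the same layout as in the question.
rng = np.random.default_rng(1)
v1 = rng.random((1000, 3))
v2 = rng.random((1000, 3))

# Ask NumPy for a contraction order; `report` is a printable summary.
path, report = np.einsum_path('ij,kj,k->i', v1, v2, v2[:, 2], optimize='optimal')

# Reuse the suggested path for the actual computation.
r_auto = np.einsum('ij,kj,k->i', v1, v2, v2[:, 2], optimize=path)

# Same contraction with the hand-written path from the answer.
r_manual = np.einsum('ij,kj,k->i', v1, v2, v2[:, 2],
                     optimize=['einsum_path', (1, 2), (0, 1)])
```

Printing report shows the chosen order and an estimated cost, which can help when deciding whether a hand-written path is worth keeping.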

CodePudding user response:

This is about 200 times faster than nested loops:

>>> (listOfVector1.dot(listOfVector2.T) * listOfVector2[:, 2]).sum(-1)

Test:

>>> from timeit import timeit
>>> timeit(lambda: loopingFunction(listOfVector1, listOfVector2), number=1)
1.4389081999834161
>>> timeit(lambda: (listOfVector1.dot(listOfVector2.T) * listOfVector2[:, 2]).sum(-1), number=1)
0.007342299999436364

The two results differ very slightly, most likely because the floating-point additions are performed in a different order, but I think the difference is harmless:

>>> out1 = loopingFunction(listOfVector1, listOfVector2)
>>> out2 = (listOfVector1.dot(listOfVector2.T) * listOfVector2[:, 2]).sum(-1)
>>> np.all(out1 == out2)
False
>>> np.all(np.abs(out1 - out2) < 1e-10)
True
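The mismatch under == is typical of floating-point arithmetic: adding the same numbers in a different order can change the last bits of the result. A tiny sketch of the effect with three plain Python floats (the variable names are made up for this example), together with a tolerance-based comparison via np.isclose:

```python
import numpy as np

# The same three numbers, grouped in two different ways.
left_to_right = (0.1 + 0.2) + 0.3
right_to_left = 0.1 + (0.2 + 0.3)

# Exact comparison is sensitive to rounding in the last bits.
exactly_equal = left_to_right == right_to_left

# Tolerance-based comparison ignores differences at rounding level.
close_enough = bool(np.isclose(left_to_right, right_to_left))
```

For whole arrays, np.allclose or np.testing.assert_allclose plays the same role as the manual threshold check above.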