Make a FOR LOOP with numpy arrays multiplication and "appending" faster


This is my current code.

It works fine when the number of elements in the second "for loop" is low (around 10k) and it takes only a few seconds, but when the number of elements in the second "for loop" is high (around 40k) it takes around 60 seconds or more: why?

For example: sometimes the second for loop takes less time with 28k elements than with 7k. I don't understand why the time isn't linearly dependent on the number of operations.

Also, as a general rule, the longer the code has been running, the longer each loop iteration takes.

To recap, the execution times usually follow these rules:

  • when operations < 10k , time < 5 seconds
  • when 10k < operations < 40k , 10s < time < 30s (seems random)
  • when operations > 40k , time ~ 60 seconds


import numpy as np
import time
import gc
from collections import deque
import random

center_freq = 20000000 
smpl_time = 0.03749995312*pow(10,-6) 

mat_contents = []

for i in range(10):
    mat_contents.append(np.ones((3600, random.randint(3000,30000))))

tempo = np.empty([0,0])

for i in range(3600):
    tempo = np.append(tempo, center_freq*smpl_time*i)

ILO = np.cos(tempo) 

check = 0

for element in mat_contents:
    start_time = time.time()
    Icolumn = np.empty([0,0])
    Imatrix = deque([])
    gc.disable()

    for colonna in element.T:
        Imatrix.append(np.multiply(ILO, colonna))

    gc.enable()
    varI = np.var(Imatrix)
    tempImean = np.mean(Imatrix)

    print('\nSize: ', len(element.T))
    print("--- %s seconds ---" % (time.time() - start_time))
    print("---  ", check, ' ---')
    check = 1

CodePudding user response:

Your code as-is killed my session, presumably because the array dimensions were too big for memory. Trying again with smaller numbers:

   In [1]: import numpy as np
   ...: from collections import deque
   ...: import random
   ...: 
   ...: center_freq = 20000000
   ...: smpl_time = 0.03749995312*pow(10,-6)
   ...: 
   ...: mat_contents = []
   ...: 
   ...: for i in range(10):
   ...:     mat_contents.append(np.ones((36, random.randint(30,300))))
   ...: 
   ...: tempo = np.empty([0,0])
   ...: 
   ...: for i in range(3600):
   ...:     tempo = np.append(tempo, center_freq*smpl_time*i)
   ...: 
   ...: ILO = np.cos(tempo)
   ...: 
   ...: check = 0
In [2]: 
In [2]: tempo.shape
Out[2]: (3600,)
In [3]: ILO
Out[3]: 
array([ 1.        ,  0.73168951,  0.07073907, ..., -0.63602366,
       -0.99137119, -0.81472814])
In [4]: len(mat_contents)
Out[4]: 10

A quick note - while the list append in your mat_contents loop is relatively fast, the np.append in the next loop is very slow: it copies the whole array on every call. Don't use it that way.
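If you want to keep the loop, the usual pattern is to append to a plain Python list and convert to an array once at the end - a minimal sketch of that fix, using the same constants as the question:

```python
import numpy as np

center_freq = 20000000
smpl_time = 0.03749995312e-6

# Each list append is amortized O(1); np.append instead reallocates
# and copies the entire array on every call, which is quadratic overall.
values = []
for i in range(3600):
    values.append(center_freq * smpl_time * i)
tempo = np.array(values)   # one allocation instead of 3600 copies
```

Same result as the np.append loop, without the repeated copying.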

I don't know if I have time now to look at the next loop, but why are you using a deque there? Why not a list? I haven't worked with deque much, but I don't think it has any advantage over a list for this right-side append.

Correcting ILO for the reduced shape:

In [16]: tempo = np.empty([0,0])
    ...: for i in range(36):
    ...:     tempo = np.append(tempo, center_freq*smpl_time*i)
    ...: 
In [17]: tempo.shape
Out[17]: (36,)
In [18]: ILO = np.cos(tempo)

I suspect that loop can be replaced by one numpy expression, but I won't dig into that yet.
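For completeness, that one numpy expression would be an np.arange multiply - a sketch with the same constants:

```python
import numpy as np

center_freq = 20000000
smpl_time = 0.03749995312e-6

# np.arange generates 0..3599 in compiled code, so no Python-level
# loop and no repeated np.append reallocations are needed.
tempo = center_freq * smpl_time * np.arange(3600)
ILO = np.cos(tempo)
```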

In [19]: %%timeit
    ...: for element in mat_contents:
    ...:     Icolumn = np.empty([0,0])
    ...:     Imatrix = deque([])
    ...:     for colonna in element.T:
    ...:         Imatrix.append(np.multiply(ILO, colonna))
    ...:     varI = np.var(Imatrix)
    ...:     tempImean = np.mean(Imatrix)
    ...: 
5.82 ms ± 6 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

Time with Imatrix = [] is the same, so using deque doesn't hurt.

mat_contents contains arrays with varying shape (2nd dimension):

In [22]: [x.shape for x in mat_contents]
Out[22]: 
[(36, 203),
 (36, 82),
 (36, 194),
 (36, 174),
 (36, 119),
 (36, 190),
 (36, 272),
 (36, 68),
 (36, 293),
 (36, 248)]

Now let's examine the slow loop that you are worried about:

In [23]: ILO.shape
Out[23]: (36,)
In [24]: element=mat_contents[0]
In [25]: element.T.shape
Out[25]: (203, 36)
In [26]: Imatrix =[]
    ...: for colonna in element.T:
    ...:         Imatrix.append(np.multiply(ILO, colonna))
    ...: arr = np.array(Imatrix)
In [27]: arr.shape
Out[27]: (203, 36)

To multiply a (36,) array by a (203,36) array we don't need a loop; broadcasting handles it.

In [28]: np.allclose(ILO*element.T, arr)
Out[28]: True
In [29]: %%timeit
    ...: Imatrix =[]
    ...: for colonna in element.T:
    ...:         Imatrix.append(np.multiply(ILO, colonna))
    ...: arr = np.array(Imatrix)
    ...: 
    ...: 
405 µs ± 2.72 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

The whole-array operation is much faster:

In [30]: timeit ILO*element.T
19.4 µs ± 14.3 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)

In your original question you loaded a .mat file. Are you trying to translate MATLAB code? numpy is like old-time MATLAB, where whole-matrix operations are much faster. numpy does not do jit compilation of loops.
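Putting the pieces together, the whole inner loop collapses to one broadcast multiply per element - a sketch with small random arrays standing in for the .mat contents:

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-ins for mat_contents: arrays with a varying second dimension.
mat_contents = [rng.random((36, n)) for n in (203, 82)]

tempo = 20000000 * 0.03749995312e-6 * np.arange(36)
ILO = np.cos(tempo)

for element in mat_contents:
    # Broadcasting: (n,36) * (36,) -> (n,36), same as the deque loop
    Imatrix = ILO * element.T
    varI = np.var(Imatrix)
    tempImean = np.mean(Imatrix)
```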
