I would like to perform the following set of operations more efficiently:
import numpy as np
from time import time
n1 = 5
n2 = 800
n3 = 1000
a = np.random.rand( n1,n2,n3 )
ts = time()
a_new = np.array( [ np.repeat( a[np.newaxis,ll,:,:], n1, axis=0 ) for ll in range(n1)] )
print(time()-ts)
ts = time()
d = n1**2
a_new = np.reshape( a_new, ( d, n2, n3 ) )
a_new = np.repeat( a_new[np.newaxis,:,:,:], n1, axis=0 )
print(time()-ts)
ts = time()
d2 = d*n1
a_new = np.reshape( a_new, ( d2, n2, n3 ))
a_new = np.reshape( a_new, ( d2*n2 , n3 ) )
print(time()-ts)
When n1, n2, n3 become large, this gets quite inefficient. What is the best way to proceed?
Answer:
Your code, tracking the size of the results:
In [22]: n1 = 5
...: n2 = 800
...: n3 = 1000
...:
...: a = np.random.rand(n1, n2, n3)
In [23]: a.shape
Out[23]: (5, 800, 1000)
In [24]: a_new = np.array([np.repeat(a[np.newaxis, ll, :, :], n1, axis=0) for ll in range(n1)])
...:
In [25]: a_new.shape
Out[25]: (5, 5, 800, 1000)
In [26]: d = n1**2
...: a1 = np.reshape(a_new, (d, n2, n3))
...: a1 = np.repeat(a1[np.newaxis, :, :, :], n1, axis=0)
...:
In [27]: a1.shape
Out[27]: (5, 25, 800, 1000)
In [28]: d2 = d * n1
...: a2 = np.reshape(a1, (d2, n2, n3))
...: a2 = np.reshape(a2, (d2 * n2, n3))
...:
...:
In [29]: a2.shape
Out[29]: (100000, 1000)
A minor point, but the double reshape that produces a2 isn't necessary; you can go directly to the final shape. In any case, reshape is fast (unless it has to force a copy, as after a transpose).
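For example, at small stand-in sizes, a single reshape gives the same result as the two-step version:

```python
import numpy as np

n1, n2, n3 = 2, 3, 4                        # small stand-ins for the original sizes
d2 = n1 ** 3
a1 = np.random.rand(n1, n1 * n1, n2, n3)    # same shape pattern as a1 above

# Two-step reshape from the question
a2_twostep = np.reshape(np.reshape(a1, (d2, n2, n3)), (d2 * n2, n3))

# One reshape straight to the final 2-D shape
a2_direct = a1.reshape(d2 * n2, n3)

print(np.array_equal(a2_twostep, a2_direct))  # True
```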
Some timings:
In [34]: %timeit a_new = np.array([np.repeat(a[np.newaxis, ll, :, :], n1, axis=0) for ll in range(n1)])
263 ms ± 11.2 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
In [36]: a1 = np.reshape( a_new, ( d, n2, n3 ) )
In [37]: %timeit np.repeat( a_new[np.newaxis,:,:,:], n1, axis=0 )
912 ms ± 134 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
So yes, the repeat does take the most time. But it's increasing the size of a large array 5x; repeat is relatively fast for something that makes an array this large.
Your a_new list comprehension isn't necessary:
In [41]: np.repeat(a[:, None, :, :], n1, axis=1).shape
Out[41]: (5, 5, 800, 1000)
In [42]: np.allclose(np.repeat(a[:, None, :, :], n1, axis=1), a_new)
Out[42]: True
In [43]: %timeit np.repeat(a[:, None, :, :], n1, axis=1)
210 ms ± 23.1 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
The time savings are modest, since the n1 loop is only 5 iterations. Note that the following repeat takes about 5x as long: a_new is 5x the size of a, and a1 is 25x the size of a.
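Those size ratios are easy to confirm at a smaller scale (same n1, scaled-down n2 and n3):

```python
import numpy as np

n1, n2, n3 = 5, 8, 10   # scaled-down stand-ins for the original sizes
a = np.random.rand(n1, n2, n3)

a_new = np.repeat(a[:, None, :, :], n1, axis=1)
a1 = np.repeat(a_new.reshape(n1 * n1, n2, n3)[None], n1, axis=0)

print(a_new.nbytes // a.nbytes)  # 5
print(a1.nbytes // a.nbytes)     # 25
```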
We could do it all in one line:
In [44]: a4 = np.repeat(np.repeat(a[:,None,:,:], n1, axis=1)[None],n1, axis=0).reshape(-1,n3)
In [45]: a4.shape
Out[45]: (100000, 1000)
In [46]: np.allclose(a2, a4)
Out[46]: True
It doesn't change the time much. Interestingly, the allclose in [46] takes the most noticeable time.
The double repeat can be written with tile (the length-n1 tile counts need to sit next to the axis they expand, or the rows come out in a different order):
a5 = np.tile(a[None,:,None,:,:], (n1,1,n1,1,1)).reshape(-1,n3)
It doesn't save time, because tile is implemented by applying repeat along each dimension.
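A small-scale check (with stand-in sizes) shows the axis placement matters: putting the length-n1 tile counts around the original first axis reproduces the double-repeat ordering exactly:

```python
import numpy as np

n1, n2, n3 = 2, 3, 4
a = np.arange(n1 * n2 * n3, dtype=float).reshape(n1, n2, n3)

# Double repeat from above
a4 = np.repeat(np.repeat(a[:, None, :, :], n1, axis=1)[None], n1, axis=0).reshape(-1, n3)

# tile version: the two new length-n1 axes straddle the original first axis
a5 = np.tile(a[None, :, None, :, :], (n1, 1, n1, 1, 1)).reshape(-1, n3)

print(np.array_equal(a4, a5))  # True
```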
Or even
a6 = np.repeat(a[None,:,:], n1*n1,axis=0).reshape(-1,n3)
I'm not going to time or test this, since creating several arrays of this size was pushing my memory limits.
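A quick check on a small array, though, shows this last variant is not equivalent: it produces the same rows, but in a different order (each a[i] block is interleaved rather than repeated consecutively):

```python
import numpy as np

n1, n2, n3 = 2, 3, 4
a = np.arange(n1 * n2 * n3, dtype=float).reshape(n1, n2, n3)

# Double repeat (matches a2/a4 above)
a4 = np.repeat(np.repeat(a[:, None, :, :], n1, axis=1)[None], n1, axis=0).reshape(-1, n3)

# Single repeat of the whole stack (a6 above)
a6 = np.repeat(a[None, :, :, :], n1 * n1, axis=0).reshape(-1, n3)

print(a4.shape == a6.shape)    # True: same shape
print(np.array_equal(a4, a6))  # False: rows come out in a different order
```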