I have the code below:
import numpy as np
wtsarray # shape(5000000,21)
covmat # shape(21,21)
portvol = np.zeros(shape=(wtsarray.shape[0],))
for i in range(0, wtsarray.shape[0]):
    portvol[i] = np.sqrt(np.dot(wtsarray[i].T, np.dot(covmat, wtsarray[i]))) * np.sqrt(mtx)
Nothing is wrong with the above code, except that there are 5 million row vectors and the for loop is slow. Do you know of a way to vectorise it? So far I have tried with little success.
Or is there a way to treat each individual row of a NumPy matrix as a row vector and perform the above operation on it?
Thanks. If you have any suggestions on rephrasing my question, please let me know as well.
CodePudding user response:
portvol = np.sqrt(np.sum(wtsarray * (wtsarray @ covmat.T), axis=1)) * np.sqrt(mtx)
should give you what you want. It replaces the first np.dot with elementwise multiplication followed by summation, and it replaces the second, np.dot(covmat, wtsarray[i]), with the matrix multiplication wtsarray @ covmat.T.
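To see why this works, note that row i of wtsarray @ covmat.T is the same vector as covmat @ wtsarray[i], so multiplying elementwise by wtsarray and summing along axis=1 gives the quadratic form for every row at once. The following is a minimal sketch that checks this against the original loop. The random data, the reduced row count, and the value of mtx are assumptions made purely for illustration; the question does not say what mtx is, so it is treated here as a positive scalar.

```python
import numpy as np

rng = np.random.default_rng(0)
n_rows, n_assets = 1000, 21            # far fewer rows than the 5 million in the question
wtsarray = rng.random((n_rows, n_assets))
a = rng.standard_normal((n_assets, n_assets))
covmat = a @ a.T                       # symmetric positive semi-definite stand-in
mtx = 12                               # assumption: a positive scalar; the value is made up

# Original loop from the question
portvol_loop = np.zeros(n_rows)
for i in range(n_rows):
    portvol_loop[i] = np.sqrt(np.dot(wtsarray[i], np.dot(covmat, wtsarray[i]))) * np.sqrt(mtx)

# Row i of `projected` equals covmat @ wtsarray[i]
projected = wtsarray @ covmat.T
portvol_vec = np.sqrt(np.sum(wtsarray * projected, axis=1)) * np.sqrt(mtx)

matches = np.allclose(portvol_loop, portvol_vec)
print(matches)
```

If covmat is a true covariance matrix it is symmetric, so covmat.T and covmat are interchangeable here; the transpose only matters for a non-symmetric matrix.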
CodePudding user response:
For smaller sample arrays:
In [24]: wtsarray = np.arange(15).reshape((5,3)); covmat=np.arange(9).reshape((3,3))
In [25]: portvol = np.zeros((5))
In [26]: for i in range(0, wtsarray.shape[0]):
    ...:     portvol[i] = np.sqrt(np.dot(wtsarray[i], np.dot(covmat, wtsarray[i])))
    ...:
In [27]: portvol
Out[27]: array([ 7.74596669, 25.92296279, 43.95452195, 61.96773354, 79.97499609])
@ogdenkev's solution:
In [28]: np.sqrt(np.sum(wtsarray * (wtsarray @ covmat.T), axis=1))
Out[28]: array([ 7.74596669, 25.92296279, 43.95452195, 61.96773354, 79.97499609])
In [30]: timeit np.sqrt(np.sum(wtsarray * (wtsarray @ covmat.T), axis=1))
20.4 µs ± 891 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
The same thing using einsum:
In [29]: np.sqrt(np.einsum('ij,jk,ik->i',wtsarray,covmat,wtsarray))
Out[29]: array([ 7.74596669, 25.92296279, 43.95452195, 61.96773354, 79.97499609])
In [31]: timeit np.sqrt(np.einsum('ij,jk,ik->i',wtsarray,covmat,wtsarray))
12.9 µs ± 24.1 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
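The subscript string 'ij,jk,ik->i' can be read as: for each row i, sum wtsarray[i, j] * covmat[j, k] * wtsarray[i, k] over j and k. As a rough illustration only (the triple loop is far too slow for real use), the sketch below spells that out on the same sample arrays and compares it with einsum. The names quad and quad_einsum are introduced here for the example.

```python
import numpy as np

wtsarray = np.arange(15).reshape((5, 3))
covmat = np.arange(9).reshape((3, 3))

n_rows, n_cols = wtsarray.shape
quad = np.zeros(n_rows)
for i in range(n_rows):            # index kept in the output ('->i')
    for j in range(n_cols):        # summed over
        for k in range(n_cols):    # summed over
            quad[i] += wtsarray[i, j] * covmat[j, k] * wtsarray[i, k]

quad_einsum = np.einsum('ij,jk,ik->i', wtsarray, covmat, wtsarray)
same = np.allclose(quad, quad_einsum)
print(same)
```

Taking np.sqrt of either result gives the values shown in Out[29].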
A matmul version, treating each row as a stacked 1x3 and 3x1 matrix:
In [35]: np.sqrt(np.squeeze(wtsarray[:,None,:]@covmat@wtsarray[:,:,None]))
Out[35]: array([ 7.74596669, 25.92296279, 43.95452195, 61.96773354, 79.97499609])
In [36]: timeit np.sqrt(np.squeeze(wtsarray[:,None,:]@covmat@wtsarray[:,:,None]))
13.5 µs ± 15.9 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
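The matmul version relies on @ treating the leading dimension as a stack of matrices. A short sketch of the intermediate shapes on the same sample arrays may make that clearer; the variable names left, right and stacked are introduced here only for illustration.

```python
import numpy as np

wtsarray = np.arange(15).reshape((5, 3))
covmat = np.arange(9).reshape((3, 3))

left = wtsarray[:, None, :]        # shape (5, 1, 3): five 1x3 row vectors
right = wtsarray[:, :, None]       # shape (5, 3, 1): five 3x1 column vectors
stacked = left @ covmat @ right    # shape (5, 1, 1): one 1x1 result per row
result = np.sqrt(np.squeeze(stacked))

print(left.shape, right.shape, stacked.shape, result.shape)
```

One thing to watch: np.squeeze removes every length-1 axis, so with a single input row it would return a 0-d array. Indexing with stacked[:, 0, 0] is an alternative that always keeps the row axis. Also, the timings above are for a 5x3 array, so the relative ranking of the three approaches may differ at 5 million rows and is worth re-measuring on the real data.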