I have a stochastic process with Mmax trajectories. For each trajectory, I have to take the dot product of two matrices, A and B. With a loop, it works great
A=np.zeros((2,Mmax),dtype=np.complex64)
B=np.zeros((2,2,Mmax),dtype=np.complex64)
C=np.zeros((2,Mmax),dtype=np.complex64)
for m in range(Mmax):
C[:,m]=B[:,:,m].dot(A[:,m])
(here are just 2x2 matrices to simplify it, when in reality they are much larger) However, this loop is slow for a large number of trajectories. I want to optimize it by vectorizing it, and I have some problems when I try to implement it
B[:,:,:].dot(A[:,:])
It gives me the error 'shapes (2,2,10) and (2,10) not aligned: 10 (dim 2) != 2 (dim 0)', which makes sense. However, I would really need to vectorize this process, or at least optimize it as much as possible.
Is there any way to get this?
CodePudding user response:
If speed is your concern, there is a way to have that multiplication non-vectorised and yet extremely fast - usually even significantly faster than that. It needs numba though:
import numpy as np
import numba as nb
@nb.njit
def mat_mul(A, B):
n, Mmax = A.shape
C = np.zeros((n, Mmax))
for m in range(Mmax):
for j in range(n):
for i in range(n):
C[j, m] = B[j, i, m]*A[i, m]
return C
Mmax = 100
A = np.ones((2, Mmax))
B = np.ones((2, 2, Mmax))
C = mat_mul(A, B)
CodePudding user response:
Define sample arrays that aren't all zeros. We want to verify values as well as shapes.
In [82]: m = 5
In [83]: A = np.arange(2*m).reshape(2,m)
In [84]: B = np.arange(2*2*m).reshape(2,2,m)
Your iteration:
In [85]: C = np.zeros((2,m))
In [86]: for i in range(m):
...: C[:,i]=B[:,:,i].dot(A[:,i])
...:
In [87]: C
Out[87]:
array([[ 25., 37., 53., 73., 97.],
[ 75., 107., 143., 183., 227.]])
It's fairly easy to express that in einsum
:
In [88]: np.einsum('ijk,jk->ik',B,A)
Out[88]:
array([[ 25, 37, 53, 73, 97],
[ 75, 107, 143, 183, 227]])
matmul/@
is a variation on np.dot
that handles 'batches' nicely. But the batch dimension has to be first (of 3). Your batch dimension, m
, is last, so we have do some transposing to get the same result:
In [90]: (np.matmul(B.transpose(2,0,1),A.transpose(1,0)[...,None])[...,0]).T
Out[90]:
array([[ 25, 37, 53, 73, 97],
[ 75, 107, 143, 183, 227]])