Vectorized Matrix-Vector multiplication along specified axis of multidimensional array-CodePudding

Suppose I have an array A of size (M,N,D,D) and a vector x of size (M,N,D). I want to produce the array b of size (M,N,D) where b[m,n,:] = np.matmul(A[m,n,:,:],x[m,n,:]). So basically, the D-by-D matrix contained in A[m,n,:] and the vector of size D contained in x[m,n,:] should be multiplied together following standard matrix-vector multiplication. A naive for-loop implementation would be

for m in range(M):
    for n in range(N):
        b[m,n,:] = np.matmul(A[m,n,:],x[m,n,:])

Obviously this is inefficient for large N and M (which is the case for me; N is of the order of 100,000 and M is of the order of 1000). My question is: How to vectorize this loop? I've tried using np.tensordot but I keep getting shape mismatches no matter how I try to structure it, presumably because A has 4 axes whereas x has 3 (Note that for all I know, np.tensordot is the right function to use, I'm just not using it correctly)

EDIT Tried using np.einsum, but I don't get the expected result (which should be [5,11]:

CodePudding user response：

This is where np.einsum shines:

np.einsum('mnij,mnj->mni', A,B)

Check with

np.allclose(np.einsum('mnij,mnj->mni', A,x), b)

returns True

CodePudding user response：

The same result can be achieved with broadcasting:

x1 = x.reshape(M,N,d,1)
b1 = np.matmul(A,x1)
b2 = b1.reshape(M,N,d)