I have two sparse binary matrices A
and B
that have matching dimensions, e.g. A
has shape I x J
and B
has shape J x K
. I have a custom operation that results in a matrix C
of shape I x J x K
, where each element (i,j,k)
is 1
only if A(i,j) = 1 and B(j,k) = 1
. I have currently implemented this operation as follows:
import numpy as np
I = 2
J = 3
K = 4
A = np.random.randint(2, size=(I, J))
B = np.random.randint(2, size=(J, K))
# Custom method
C = np.zeros((I,J,K))
for i in range(I):
for j in range(J):
for k in range(K):
if A[i,j] == 1 and B[j,k] == 1:
C[i,j,k] = 1
print(C)
However, the for loop is quite slow for large I,J,K
. Is it possible to achieve this operation using numpy methods only to speed it up? I have looked at np.multiply.outer
, but no success so far.
CodePudding user response:
Here you go:
C = np.einsum('ij,jk->ijk', A,B)
CodePudding user response:
Try to do what you're already doing with numba. Here's an example using your code, Sehan2's method and numba:
import numpy as np
from numba import jit, prange
I = 2
J = 3
K = 4
np.random.seed(0)
A = np.random.randint(2, size=(I, J))
B = np.random.randint(2, size=(J, K))
# Custom method
def Custom_method(A, B):
I, J = A.shape
J, K = B.shape
C = np.zeros((I,J,K))
for i in range(I):
for j in range(J):
for k in range(K):
if A[i,j] == 1 and B[j,k] == 1:
C[i,j,k] = 1
return C
def Custom_method_ein(A, B):
C = np.einsum('ij,jk->ijk', A,B)
return C
@jit(nopython=True)
def Custom_method_numba(A, B):
I, J = A.shape
J, K = B.shape
C = np.zeros((I,J,K))
for i in prange(I):
for j in prange(J):
for k in prange(K):
if A[i,j] == 1 and B[j,k] == 1:
C[i,j,k] = 1
return C
print('original')
%timeit Custom_method(A, B)
print('einsum')
%timeit Custom_method_ein(A, B)
print('numba')
%timeit Custom_method_numba(A, B)
Output:
original
10000 loops, best of 5: 18.8 µs per loop
einsum
The slowest run took 20.51 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 5: 3.32 µs per loop
numba
The slowest run took 15.99 times longer than the fastest. This could mean that an intermediate result is being cached.
1000000 loops, best of 5: 815 ns per loop
Note that you can make your code run much faster and more efficiently if you use sparse matrix representations. That way you avoid performing unnecessary operations.