Custom 2D matrix operation using numpy-CodePudding

I have two sparse binary matrices A and B that have matching dimensions, e.g. A has shape I x J and B has shape J x K. I have a custom operation that results in a matrix C of shape I x J x K, where each element (i,j,k) is 1 only if A(i,j) = 1 and B(j,k) = 1. I have currently implemented this operation as follows:

import numpy as np

I = 2
J = 3
K = 4

A = np.random.randint(2, size=(I, J))
B = np.random.randint(2, size=(J, K))

# Custom method
C = np.zeros((I,J,K))
for i in range(I):
    for j in range(J):
        for k in range(K):
            if A[i,j] == 1 and B[j,k] == 1:
                C[i,j,k] = 1

print(C)

However, the for loop is quite slow for large I,J,K. Is it possible to achieve this operation using numpy methods only to speed it up? I have looked at np.multiply.outer, but no success so far.

CodePudding user response：

Here you go:

C = np.einsum('ij,jk->ijk', A,B)

CodePudding user response：

Try to do what you're already doing with numba. Here's an example using your code, Sehan2's method and numba:

import numpy as np
from numba import jit, prange

I = 2
J = 3
K = 4

np.random.seed(0)

A = np.random.randint(2, size=(I, J))
B = np.random.randint(2, size=(J, K))

# Custom method
def Custom_method(A, B):
    I, J = A.shape
    J, K = B.shape
    C = np.zeros((I,J,K))
    for i in range(I):
        for j in range(J):
            for k in range(K):
                if A[i,j] == 1 and B[j,k] == 1:
                    C[i,j,k] = 1
    return C

def Custom_method_ein(A, B):
    C = np.einsum('ij,jk->ijk', A,B)
    return C

@jit(nopython=True)
def Custom_method_numba(A, B):
    I, J = A.shape
    J, K = B.shape
    C = np.zeros((I,J,K))
    for i in prange(I):
        for j in prange(J):
            for k in prange(K):
                if A[i,j] == 1 and B[j,k] == 1:
                    C[i,j,k] = 1
    return C

print('original')
%timeit Custom_method(A, B)
print('einsum')
%timeit Custom_method_ein(A, B)
print('numba')
%timeit Custom_method_numba(A, B)

Output:

original
10000 loops, best of 5: 18.8 µs per loop
einsum
The slowest run took 20.51 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 5: 3.32 µs per loop
numba
The slowest run took 15.99 times longer than the fastest. This could mean that an intermediate result is being cached.
1000000 loops, best of 5: 815 ns per loop

Note that you can make your code run much faster and more efficiently if you use sparse matrix representations. That way you avoid performing unnecessary operations.