Home > Net >  Python (NumPy): Memory efficient array multiplication with fancy indexing
Python (NumPy): Memory efficient array multiplication with fancy indexing

Time:12-06

I'm looking to do fast matrix multiplication in python, preferably NumPy, of an array A with another array B of repeated matrices by using a third array I of indices. This can be accomplished using fancy indexing and matrix multiplication:

from numpy.random import rand, randint

A = rand(1000,5,5)
B = rand(40000000,5,1)
I = randint(low=0, high=1000, size=40000000)

A[I] @ B

However, this creates the intermediate array A[I] of shape (40000000, 5, 5) which overflows the memory. It seems highly inefficient to have to repeat a small set of matrices for multiplication, and this is essentially a more general version of broadcasting such as A[0:1] @ B which has no issues.

Are there any alternatives?

I have looked at NumPy's einsum function but have not seen any support for utilizing an index vector in the call.

CodePudding user response:

If you're open to another package, you could wrap it up with dask.

from numpy.random import rand, randint
from dask import array as da

A = da.from_array(rand(1000,5,5))
B = da.from_array(rand(40000000,5,1))
I = da.from_array(randint(low=0, high=1000, size=40000000))

fancy = A[I] @ B

fancy desc

After finished manipulating, then bring it into memory using fancy.compute()

  • Related