I'm looking to do fast matrix multiplication in python, preferably NumPy, of an array A with another array B of repeated matrices by using a third array I of indices. This can be accomplished using fancy indexing and matrix multiplication:
from numpy.random import rand, randint
A = rand(1000,5,5)
B = rand(40000000,5,1)
I = randint(low=0, high=1000, size=40000000)
A[I] @ B
However, this creates the intermediate array A[I]
of shape (40000000, 5, 5) which overflows the memory. It seems highly inefficient to have to repeat a small set of matrices for multiplication, and this is essentially a more general version of broadcasting such as A[0:1] @ B
which has no issues.
Are there any alternatives?
I have looked at NumPy's einsum function but have not seen any support for utilizing an index vector in the call.
CodePudding user response:
If you're open to another package, you could wrap it up with dask.
from numpy.random import rand, randint
from dask import array as da
A = da.from_array(rand(1000,5,5))
B = da.from_array(rand(40000000,5,1))
I = da.from_array(randint(low=0, high=1000, size=40000000))
fancy = A[I] @ B
After finished manipulating, then bring it into memory using fancy.compute()