In https://numpy.org/doc/stable/reference/generated/numpy.einsum.html, the following example of broadcasting and scalar multiplication is given:
np.einsum('..., ...', 3, c)
array([[ 0,  3,  6], [ 9, 12, 15]])
So it seems einsum can mimic the alpha/beta prefactors in DGEMM (http://www.netlib.org/lapack/explore-html/d1/d54/group__double__blas__level3_gaeda3cbd99c8fb834a60a6412878226e1.html).
Does this imply that including the scalar multiplication inside einsum as one step will be faster than the two-step approach: (1) A,B->C and (2) C * prefactor?
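For reference, the two-step computation described above can be sketched in plain NumPy (shapes and values here are illustrative, not from the question):

```python
import numpy as np

# DGEMM computes C = alpha * (A @ B) + beta * C; here only the
# alpha prefactor is considered, applied as a separate second step.
alpha = 2.0
A = np.random.default_rng(0).random((2, 3))
B = np.random.default_rng(1).random((3, 4))

step1 = np.einsum('ij,jk->ik', A, B)   # (1) A,B -> C
step2 = alpha * step1                  # (2) C * prefactor
assert np.allclose(step2, alpha * (A @ B))
```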
I tried to extend https://ajcr.net/Basic-guide-to-einsum/ as follows:
import numpy as np
A = np.array([0, 1, 2])
B = np.array([[ 0, 1, 2, 3], [ 4, 5, 6, 7], [ 8, 9, 10, 11]])
C = np.einsum('i,ij->i', 2., A, B)
print(C)
and got:
ValueError: einstein sum subscripts string contains too many subscripts for operand
So my question is: is there any way to include a scalar factor inside einsum, and does doing so accelerate the calculation?
CodePudding user response:
I haven't used this scalar feature, but here's how it works — the scalar operand gets an empty subscript of its own:
In [422]: np.einsum('i,ij->i',A,B)
Out[422]: array([ 0, 22, 76])
In [423]: np.einsum(',i,ij->i',2,A,B)
Out[423]: array([ 0, 44, 152])
The time savings appear to be minor:
In [424]: timeit np.einsum(',i,ij->i',2,A,B)
11.5 µs ± 271 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
In [425]: timeit 2*np.einsum('i,ij->i',A,B)
12.3 µs ± 274 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
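At these tiny sizes both variants are dominated by einsum's call overhead, so a sketch for rechecking the comparison at a larger size may be more informative (array shapes are arbitrary; timings will vary by machine, so none are claimed here):

```python
import numpy as np
from timeit import timeit

A = np.random.default_rng(0).random(1000)
B = np.random.default_rng(1).random((1000, 1000))

# scalar folded into einsum vs. scalar applied afterwards
t_folded = timeit(lambda: np.einsum(',i,ij->i', 2.0, A, B), number=100)
t_two_step = timeit(lambda: 2.0 * np.einsum('i,ij->i', A, B), number=100)
print(f'folded: {t_folded:.4f}s  two-step: {t_two_step:.4f}s')
```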
Another example, with two scalar operands:
In [427]: np.einsum(',i,,ij->i',3,A,2,B)
Out[427]: array([ 0, 132, 456])
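A minimal runnable sketch confirming that the subscript-less scalar operands act as multiplicative prefactors (reusing the A and B from the session above):

```python
import numpy as np

A = np.array([0, 1, 2])
B = np.arange(12).reshape(3, 4)

base = np.einsum('i,ij->i', A, B)            # array([ 0, 22, 76])
scaled = np.einsum(',i,,ij->i', 3, A, 2, B)  # two scalar operands: 3 and 2
assert np.array_equal(scaled, 6 * base)      # the prefactors multiply together
```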