NumPy: improving array operations

As an example, I have the following two 1-D arrays:

import numpy as np

a = np.array([1, 2, 3, 4])
b = np.array([5, 6])

I need to multiply a by each element of b to obtain a 2-D array:

[[5 10 15 20],
 [6 12 18 24]]

So far, I have solved the problem by creating two new 2-D arrays, repeating rows for one and columns for the other, and then multiplying them element-wise:

a_2d = np.repeat(a[np.newaxis, :], b.size, axis=0)
b_2d = np.tile(b[:, np.newaxis], a.size)

print(a_2d*b_2d)
#[[ 5 10 15 20]
# [ 6 12 18 24]]

Is there a way to make this operation more efficient?

Ideally the solution would not be limited to multiplication but would work for all four basic arithmetic operations. Thanks a lot!

CodePudding user response：

Use broadcasting:

>>> a * b[:, np.newaxis]
array([[ 5, 10, 15, 20],
       [ 6, 12, 18, 24]])
>>> a + b[:, np.newaxis]
array([[ 6,  7,  8,  9],
       [ 7,  8,  9, 10]])
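
The same pattern should cover subtraction and division as well. A minimal sketch, reusing a and b from the question and cross-checking the product against the repeat/tile version (the names col, product, total, difference and quotient are only illustrative):

```python
import numpy as np

a = np.array([1, 2, 3, 4])
b = np.array([5, 6])

# b[:, np.newaxis] has shape (2, 1) and a has shape (4,).
# Broadcasting treats both as shape (2, 4) without building the
# repeated arrays explicitly.
col = b[:, np.newaxis]

product = a * col
total = a + col
difference = a - col   # a minus each element of b
quotient = a / col     # true division, so the result is float

# Cross-check against the repeat/tile approach from the question.
a_2d = np.repeat(a[np.newaxis, :], b.size, axis=0)
b_2d = np.tile(b[:, np.newaxis], a.size)
assert np.array_equal(product, a_2d * b_2d)
```

For subtraction and division the operand order matters: a - col and col - a give different results.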

CodePudding user response：

Another option is the powerful einsum function, e.g.:

np.einsum("i,j->ji", a, b)

See, e.g., here for a great description of the function.
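
To unpack the subscript string: i labels the axis of a, j labels the axis of b, and the output "ji" puts b's axis first. A small sketch, reusing a and b from the question (the variable names are only illustrative):

```python
import numpy as np

a = np.array([1, 2, 3, 4])
b = np.array([5, 6])

# "i,j->ji": result[j, i] = a[i] * b[j], so the shape is (b.size, a.size).
rows_from_b = np.einsum("i,j->ji", a, b)

# Swapping the output labels gives the transposed layout, shape (a.size, b.size).
rows_from_a = np.einsum("i,j->ij", a, b)
```

Note that einsum only expresses products and sums of products, so unlike broadcasting it does not extend to addition, subtraction or division.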

CodePudding user response：

Use numpy.outer, passing b first so that each row corresponds to one element of b:

np.outer(b, a)

# [[ 5 10 15 20]
#  [ 6 12 18 24]]
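
np.outer itself only multiplies. For the other operations mentioned in the question, NumPy's binary ufuncs each have an .outer method. A small sketch, reusing a and b from the question (the result names are only illustrative):

```python
import numpy as np

a = np.array([1, 2, 3, 4])
b = np.array([5, 6])

# ufunc.outer(b, a)[j, i] applies the operation to (b[j], a[i]).
products = np.multiply.outer(b, a)      # same values as np.outer(b, a)
sums = np.add.outer(b, a)
differences = np.subtract.outer(b, a)   # b[j] - a[i]
quotients = np.divide.outer(b, a)       # b[j] / a[i], float result
```

The argument order fixes both the layout and, for subtraction and division, the direction of the operation. To get a[i] - b[j] in the same (b.size, a.size) layout, one option is np.subtract.outer(a, b).T.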

CodePudding user response：

Timing comparison of the answers above:

For your sample:

# einsum
%timeit np.einsum("i,j->ji", a, b)
2.1 µs ± 34.5 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

# outer
%timeit np.outer(b, a)
1.94 µs ± 10.2 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

# broadcast
%timeit a * b[:, np.newaxis]
938 ns ± 10.6 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

For a with 10000 elements and b with 500 elements:

# einsum
%timeit np.einsum("i,j->ji", a, b)
7.02 ms ± 149 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

# outer
%timeit np.outer(b, a)
4.62 ms ± 224 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

# broadcast
%timeit a * b[:, np.newaxis]
4.6 ms ± 121 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
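
To reproduce this kind of comparison outside IPython, something like the following sketch could be used. It assumes random float arrays of the sizes above (the original inputs are not shown), and the dictionary name candidates is only illustrative. Absolute timings will differ by machine and NumPy version.

```python
import timeit

import numpy as np

rng = np.random.default_rng(0)
a = rng.random(10_000)   # assumed contents; only the size comes from the answer
b = rng.random(500)

candidates = {
    "einsum": lambda: np.einsum("i,j->ji", a, b),
    "outer": lambda: np.outer(b, a),
    "broadcast": lambda: a * b[:, np.newaxis],
}

# Check that all three produce the same (500, 10000) array before timing.
reference = candidates["broadcast"]()
for name, fn in candidates.items():
    assert np.allclose(fn(), reference), name

# Average over a few calls; %timeit in IPython does more careful repetition.
runs = 20
for name, fn in candidates.items():
    seconds = timeit.timeit(fn, number=runs) / runs
    print(f"{name:>9}: {seconds * 1e3:.2f} ms per call")
```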