Suppose I have the following NumPy array:
import numpy as np
arr = np.array([0, 1, 2, 3, 4])  # can be any array
I want to know the fastest way to perform the following operation:
n = arr.shape[0]
result = np.tile(arr, (n, 1)) - arr.reshape((-1, 1))
print(result)
[[ 0  1  2  3  4]
 [-1  0  1  2  3]
 [-2 -1  0  1  2]
 [-3 -2 -1  0  1]
 [-4 -3 -2 -1  0]]
(1) How can I create the matrix "result" efficiently, given that n can be very large?
(2) Does this matrix have a particular name?
CodePudding user response:
This is a bit faster, because broadcasting avoids building the tiled n-by-n copy first:
result = arr - arr[:, None]
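As a sanity check that the broadcast expression gives the same matrix as the tile-based one, here is a minimal sketch. The function names `diff_tile` and `diff_broadcast` are illustrative only and do not come from the post.

```python
import numpy as np

def diff_tile(arr):
    """Original approach: build n stacked copies of arr, then subtract a column."""
    n = arr.shape[0]
    return np.tile(arr, (n, 1)) - arr.reshape((-1, 1))

def diff_broadcast(arr):
    """Broadcast the (n,) row against an (n, 1) column; no tiled copy is built."""
    return arr - arr[:, None]

arr = np.array([0, 1, 2, 3, 4])
assert np.array_equal(diff_tile(arr), diff_broadcast(arr))

# np.subtract.outer(arr, arr)[i, j] is arr[i] - arr[j], so its transpose
# is the same matrix (entry [i, j] equals arr[j] - arr[i]).
assert np.array_equal(np.subtract.outer(arr, arr).T, diff_broadcast(arr))
```

Both versions still allocate the full n-by-n result, so for very large n memory is likely to be the limiting factor either way.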
Cursory benchmarks, nothing scientific (measured with timeit):

            5 items (arr), 100 times    10,000 items (np.arange), once
OP:         0.0006383560000000066       0.7902513520000001
This one:   0.0001735200000000381       0.3640661519999999
Kelly's:    0.00027326299999996806      0.36036748900000015   (see comments)
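For anyone who wants to repeat a comparison like this, a rough sketch using the standard `timeit` module follows. The helper `time_stmt` and the dictionary layout are assumptions made for illustration, not the exact script behind the numbers above, and timings will vary by machine.

```python
import timeit
import numpy as np

def time_stmt(stmt, arr, number):
    """Total seconds for `number` runs of `stmt`, with np, arr and n in scope."""
    scope = {"np": np, "arr": arr, "n": arr.shape[0]}
    return timeit.timeit(stmt, globals=scope, number=number)

small = np.array([0, 1, 2, 3, 4])
candidates = {
    "OP": "np.tile(arr, (n, 1)) - arr.reshape((-1, 1))",
    "broadcast": "arr - arr[:, None]",
}

# Time each candidate 100 times on the 5-item array.
timings = {name: time_stmt(stmt, small, number=100)
           for name, stmt in candidates.items()}
for name, seconds in timings.items():
    print(f"{name}: {seconds:.6f} s for 100 runs")
```

For the large case, passing `np.arange(10_000)` with `number=1` mirrors the second column, at the cost of allocating a 10,000-by-10,000 array per run.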
CodePudding user response:
Comparing the broadcast solution to scipy.linalg.toeplitz and the OP's approach:
N = 1000
arr = np.arange(N)
%timeit np.tile(arr, (N, 1)) - arr.reshape((-1, 1))
1.83 ms ± 53.8 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
%timeit arr - arr[:, None]
1.11 ms ± 25 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
%timeit scipy.linalg.toeplitz(-arr, arr)
727 µs ± 21.6 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
N = 10000
arr = np.arange(N)
%timeit np.tile(arr, (N, 1)) - arr.reshape((-1, 1))
184 ms ± 9.27 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
%timeit arr - arr[:, None]
85.3 ms ± 597 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
%timeit scipy.linalg.toeplitz(-arr, arr)
70.6 ms ± 573 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
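For these np.arange inputs the scipy solution is the fastest, but toeplitz(-arr, arr) only reproduces the result when arr is evenly spaced and starts at 0; for an arbitrary arr, use the broadcast version.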
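On the naming question: the matrix of pairwise differences arr[j] - arr[i] is always antisymmetric, but it has constant diagonals (the defining property of a Toeplitz matrix) only when arr is evenly spaced. The NumPy-only sketch below illustrates this; `is_toeplitz` is a hypothetical helper written for this example, not a NumPy or SciPy function.

```python
import numpy as np

def is_toeplitz(m):
    """True if every entry equals its upper-left neighbour (constant diagonals)."""
    return bool(np.all(m[1:, 1:] == m[:-1, :-1]))

evenly_spaced = np.arange(5)           # 0, 1, 2, 3, 4
arbitrary = np.array([3, 1, 4, 1, 5])  # no constant step

diff_even = evenly_spaced - evenly_spaced[:, None]
diff_arbitrary = arbitrary - arbitrary[:, None]

assert is_toeplitz(diff_even)
assert not is_toeplitz(diff_arbitrary)

# Antisymmetry holds for any input: m[i, j] == -m[j, i].
assert np.array_equal(diff_arbitrary, -diff_arbitrary.T)
```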