I have a sample array:
import numpy as np
a = np.array(
[
[1, 2, 3],
[4, 5, 6],
[7, 8, 9],
[10, 11, 12],
[13, 14, 15],
]
)
And an array of indices from which I would like to get averages:
b = np.array([[1,3], [1,2], [2,3]])
In addition, I need the final result to have the first row concatenated to each of these averages.
I can get the desired result using this:
np.concatenate( (np.tile(a[0],(3,1)), a[b].mean(1)), axis=1)
array([[ 1. , 2. , 3. , 7. , 8. , 9. ],
[ 1. , 2. , 3. , 5.5, 6.5, 7.5],
[ 1. , 2. , 3. , 8.5, 9.5, 10.5]])
I am wondering if there is a more computationally efficient way, as I've heard concatenate is slow (see e.g. Numpy concatenate is slow: any alternative approach?).
I'm thinking there might be a way with a combination of advanced indexing, .mean(), and reshape, but I am not able to come up with anything that gives the desired array.
CodePudding user response:
The problem is not that concatenate is slow. In fact, it is not so slow. The problem is using it in a loop to produce a growing array: that pattern is very inefficient because it creates many temporary arrays and copies. However, you do not use such a pattern here, so this is fine. In your case, concatenate is used properly and matches your intent exactly. You could create an array and fill the left and right parts separately, but that is essentially what concatenate does in the end.

That being said, concatenate has a fairly large overhead, mainly for small arrays (like most Numpy functions), because of many internal checks (needed to adapt its behaviour to the shapes of the input arrays). Moreover, the implicit casting of np.tile(a[0], (3, 1)) from np.int_ to np.float64 introduces another overhead. Finally, note that mean is not very optimized for such a case: it is faster to use (a[b[:,0]] + a[b[:,1]]) * 0.5, although the intent is less clear.
n, m = a.shape[1], b.shape[0]                  # n: columns of a, m: rows of b
res = np.empty((m, 2 * n), dtype=np.float64)
res[:, :n] = a[0]                              # Note: implicit int -> float conversion done here
res[:, n:] = (a[b[:, 0]] + a[b[:, 1]]) * 0.5   # Also here
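As a quick sanity check (this verification snippet is mine, not part of the original answer), the filled array can be compared against the concatenate-based result:

import numpy as np
a = np.arange(1, 16).reshape(5, 3)   # same sample array as in the question
b = np.array([[1, 3], [1, 2], [2, 3]])
expected = np.concatenate((np.tile(a[0], (3, 1)), a[b].mean(1)), axis=1)
n, m = a.shape[1], b.shape[0]
res = np.empty((m, 2 * n), dtype=np.float64)
res[:, :n] = a[0]
res[:, n:] = (a[b[:, 0]] + a[b[:, 1]]) * 0.5
print(np.allclose(res, expected))    # True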
The resulting operation is about 3 times faster on my machine with your example. That may not hold for big input arrays (although I expect a speed-up there too).

For big arrays, the best solution is to use Numba (or Cython) code with explicit loops, so as to avoid creating and filling big, expensive temporary arrays. Numba should also speed up the computation on small arrays because it mostly removes the overhead of the Numpy functions (I expect a speed-up of about 5x-10x here).
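A minimal sketch of such a Numba version, assuming Numba is installed (the function name and the assumption that b always holds pairs of row indices are mine, not from the original answer):

import numba as nb
import numpy as np

@nb.njit
def concat_first_row_with_pair_means(a, b):
    # b is assumed to hold pairs of row indices into a
    m, n = b.shape[0], a.shape[1]
    res = np.empty((m, 2 * n), dtype=np.float64)
    for i in range(m):
        r0, r1 = b[i, 0], b[i, 1]
        for j in range(n):
            res[i, j] = a[0, j]                          # copy the first row of a
            res[i, n + j] = (a[r0, j] + a[r1, j]) * 0.5  # pairwise mean of the indexed rows
    return res

After the first call triggers JIT compilation, concat_first_row_with_pair_means(a, b) should return the same array as the vectorized version above, without building any temporary arrays.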