I have two NumPy arrays, x and y, where x is n-dimensional with n >= 1 but unknown at "compile time", whereas y is one-dimensional. The first dimension of x is guaranteed to match the first (and only) dimension of y. I would like to get their "sum", call it s, with the same shape as x, as follows:
import numpy as np
x = np.random.randn(5,100,10)
y = np.random.randn(5)
s = np.empty_like(x)
for i in range(x.shape[0]):
    s[i] = x[i] + y[i]
But I would like to avoid the for loop both for readability and, more importantly, for speed reasons.
Obviously, because of how the broadcasting conventions in NumPy work, I cannot simply do x + y. This would either throw an error or, worse, work by coincidence and give an unintended result.
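To make both failure modes concrete, here is a small sketch. The array shapes and the names x_bad, x_coincidence, and wrong are chosen only for illustration and are not part of my actual data.

```python
import numpy as np

y = np.arange(5.0)

# Trailing axis of x has length 10 while y has length 5:
# the shapes cannot be broadcast, so NumPy raises ValueError.
x_bad = np.zeros((5, 100, 10))
try:
    x_bad + y
    raised = False
except ValueError:
    raised = True

# Trailing axis of x happens to have length 5: no error, but y is
# aligned with the LAST axis of x instead of the first.
x_coincidence = np.zeros((5, 100, 5))
wrong = x_coincidence + y
# Here wrong[i, j, k] equals y[k], whereas the intended sum has y[i].
```

The second case is the dangerous one, since the result has the expected shape and nothing signals that the wrong axis was used.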
I found two relatively nice one liners,
s1 = (x.T + y).T
s2 = x + y[(slice(0, y.shape[0]),) + (np.newaxis,)*(x.ndim-1)]
which are not bad. The first one exploits the fact that it is indeed the first dimension of x that matches the only dimension of y; it would not work otherwise. The second is more general, but quite verbose.
Since I am still in the process of learning Python and Numpy, I would like to know if there are other (ideally better, but I am also interested in general) alternatives to do what I want to do. Essentially what I am maybe looking for is a way to override the broadcasting conventions...
CodePudding user response:
You can't change the broadcasting rules. So one way or another you have to add trailing dimensions to y.
You've used newaxis, producing:
In [9]: y[:,None,None].shape
Out[9]: (5, 1, 1)
Constructing a similar tuple for reshape may be a bit simpler:
In [10]: y.reshape((-1,1,1)).shape
Out[10]: (5, 1, 1)
expand_dims is another way of specifying the reshape:
In [11]: np.expand_dims(y,(1,2)).shape
Out[11]: (5, 1, 1)
None of these are computationally expensive, even if the code ends up a bit wordy.
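One reason they are cheap, as far as I understand it, is that for a contiguous 1-D array all three return views on the original data rather than copies. A minimal check, with variable names (a, b, c, all_views) picked just for this sketch:

```python
import numpy as np

y = np.random.randn(5)

a = y[:, None, None]           # indexing with None (np.newaxis)
b = y.reshape((-1, 1, 1))      # reshape with an explicit tuple
c = np.expand_dims(y, (1, 2))  # tuple axis needs NumPy >= 1.18

# Each result has the same shape and shares its buffer with y.
shapes = [v.shape for v in (a, b, c)]
all_views = all(np.shares_memory(v, y) for v in (a, b, c))
```

So the only real cost is building the small shape or index tuple; no array data is moved.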
There are several atleast_* functions, but they don't help here:
In [19]: np.atleast_3d(y).shape
Out[19]: (1, 5, 1)
Still, looking at the code for expand_dims or the atleast functions may give you ideas of how to add dimensions. One way or another they use newaxis or reshape.
You can also specify dimensions with np.array, but that adds leading dimensions:
In [22]: np.array(y, ndmin=3, copy=False).shape
Out[22]: (1, 1, 5)
Edit: using x.ndim:
In [30]: dim=[1]*x.ndim; dim[0]=-1;y.reshape(dim).shape
Out[30]: (5, 1, 1)
In [44]: y.reshape((-1,) + (1,)*(x.ndim-1)).shape
Out[44]: (5, 1, 1)
In [33]: np.expand_dims(y,tuple(np.arange(1,x.ndim))).shape
Out[33]: (5, 1, 1)
In [36]: np.expand_dims(y,list(range(1,x.ndim))).shape
Out[36]: (5, 1, 1)
Your version, slightly simplified:
In [45]: y[((slice(None),) + (None,)*(x.ndim-1))].shape
Out[45]: (5, 1, 1)
In my timings, this last version is the fastest.
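If the wordiness is the main concern, one option is to hide the reshape in a small helper. The function name add_along_first_axis below is hypothetical (it is not a NumPy function), and this is only a sketch that checks the result against the loop from the question:

```python
import numpy as np

def add_along_first_axis(x, y):
    """Hypothetical helper: add 1-D y to x along x's first axis."""
    # Target shape (-1, 1, ..., 1): one length-1 axis for every
    # dimension of x beyond the first, so broadcasting lines y up
    # with axis 0 of x.
    return x + y.reshape((-1,) + (1,) * (x.ndim - 1))

x = np.random.randn(5, 100, 10)
y = np.random.randn(5)

# Reference result from the explicit loop in the question.
s_loop = np.empty_like(x)
for i in range(x.shape[0]):
    s_loop[i] = x[i] + y[i]

s = add_along_first_axis(x, y)
```

The same helper also works when x is itself one-dimensional, because the target shape then reduces to (-1,).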