Override broadcasting conventions in Numpy?

Time:12-17

I have two NumPy arrays, x and y, where x is n-dimensional with n>=1 but unknown at "compile time", whereas y is one-dimensional. The first dimension of x is guaranteed to match the first (and only) dimension of y. I would like to get their "sum", call it s, with the same shape as x, as follows:

import numpy as np

x  = np.random.randn(5,100,10)
y  = np.random.randn(5)

s = np.empty_like(x)
for i in range(x.shape[0]):
    s[i] = x[i] + y[i]

But I would like to avoid the for loop both for readability and, more importantly, for speed reasons.

Obviously, because of how the broadcasting conventions in NumPy work, I cannot simply do x + y. This would either throw an error or, worse, work by coincidence and give an unintended result.
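To make both failure modes concrete, here is a small sketch. The second shape, (5, 100, 5), is made up for illustration and is not part of the setup above; it is chosen so that the last axis happens to have the same length as y.

```python
import numpy as np

x = np.random.randn(5, 100, 10)
y = np.random.randn(5)

# Broadcasting aligns trailing axes, so y's length (5) is compared
# with x's LAST axis (10) and the shapes are rejected.
try:
    x + y
    raised = False
except ValueError:
    raised = True

# Hypothetical shape whose last axis also has length 5.
x2 = np.random.randn(5, 100, 5)
silent = x2 + y  # no error, but y is added along the last axis

# Intended result: y[i] added to every element of x2[i].
intended = np.empty_like(x2)
for i in range(x2.shape[0]):
    intended[i] = x2[i] + y[i]

same = np.allclose(silent, intended)
print(raised, same)
```

In the second case nothing warns you that the result is not the one the loop would produce.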

I found two relatively nice one-liners,

s1 = (x.T + y).T
s2 = x + y[(slice(0, y.shape[0]),) + (np.newaxis,)*(x.ndim-1)]

which are not bad. The first one exploits the fact that it is the first dimension of x that matches the only dimension of y; it would not work otherwise. The second is more general, but quite verbose.
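A quick way to confirm that both one-liners agree with the loop is sketched below. The name s_loop is introduced here only for the comparison.

```python
import numpy as np

x = np.random.randn(5, 100, 10)
y = np.random.randn(5)

# Reference result from the explicit loop.
s_loop = np.empty_like(x)
for i in range(x.shape[0]):
    s_loop[i] = x[i] + y[i]

# Transpose trick: after .T the first axis of x becomes the last one,
# which is where broadcasting aligns y.
s1 = (x.T + y).T

# Explicit index: keep y's axis and append x.ndim - 1 new axes of length 1.
s2 = x + y[(slice(0, y.shape[0]),) + (np.newaxis,) * (x.ndim - 1)]

print(np.allclose(s1, s_loop), np.allclose(s2, s_loop))
```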

Since I am still in the process of learning Python and Numpy, I would like to know if there are other (ideally better, but I am also interested in general) alternatives to do what I want to do. Essentially what I am maybe looking for is a way to override the broadcasting conventions...

CodePudding user response:

You can't change the broadcasting rules, so one way or another you have to add trailing dimensions to y.

You've used newaxis, producing:

In [9]: y[:,None,None].shape
Out[9]: (5, 1, 1)

Constructing a similar tuple for reshape may be a bit simpler:

In [10]: y.reshape((-1,1,1)).shape
Out[10]: (5, 1, 1)

expand_dims is another way of specifying the reshape:

In [11]: np.expand_dims(y,(1,2)).shape
Out[11]: (5, 1, 1)

None of these are computationally expensive, even if the code ends up a bit wordy.
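One reason they are cheap: for a contiguous y like the one in this example, each of the three forms returns a view on y's data, not a copy. A small check, assuming a NumPy version recent enough to accept a tuple of axes in expand_dims:

```python
import numpy as np

y = np.random.randn(5)

a = y[:, None, None]
b = y.reshape((-1, 1, 1))
c = np.expand_dims(y, (1, 2))

# All three have the same shape and share memory with y,
# so no element data is copied.
for name, arr in (("index", a), ("reshape", b), ("expand_dims", c)):
    print(name, arr.shape, np.shares_memory(arr, y))
```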

There are several atleast_* functions, but they don't help here because they do not add the new dimensions where you need them:

In [19]: np.atleast_3d(y).shape
Out[19]: (1, 5, 1)

Still, looking at the code for expand_dims or the atleast functions may give you ideas for how to add dimensions. One way or another, they use newaxis or reshape.

You can also specify the number of dimensions with np.array, but that adds leading dimensions, not trailing ones:

In [22]: np.array(y, ndmin=3, copy=False).shape
Out[22]: (1, 1, 5)

edit

Using x.ndim:

In [30]: dim=[1]*x.ndim; dim[0]=-1;y.reshape(dim).shape
Out[30]: (5, 1, 1)
In [44]: y.reshape((-1,) + (1,)*(x.ndim-1)).shape
Out[44]: (5, 1, 1)

In [33]: np.expand_dims(y,tuple(np.arange(1,x.ndim))).shape
Out[33]: (5, 1, 1)
In [36]: np.expand_dims(y,list(range(1,x.ndim))).shape
Out[36]: (5, 1, 1)
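The reshape approach also extends to the case where the matching axis is not the first one. The helper below is a sketch with a made-up name (add_along_axis is not a NumPy function), and y1 is a second vector added only to exercise axis 1.

```python
import numpy as np

def add_along_axis(x, y, axis=0):
    """Add the 1-D array y to x along `axis` (hypothetical helper)."""
    shape = [1] * x.ndim
    shape[axis] = -1  # y's length goes on the matching axis, 1 everywhere else
    return x + y.reshape(shape)

x = np.random.randn(5, 100, 10)
y = np.random.randn(5)
y1 = np.random.randn(100)

s0 = add_along_axis(x, y)            # matches x's first axis
s1 = add_along_axis(x, y1, axis=1)   # matches x's second axis
print(s0.shape, s1.shape)
```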

Your version, slightly simplified:

In [45]: y[((slice(None),) + (None,)*(x.ndim-1))].shape
Out[45]: (5, 1, 1)

In timings, this version is the fastest.
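If you want to compare the options on your own machine, a rough timing sketch could look like this. The labels and the repeat count are arbitrary choices, and the absolute numbers will vary with hardware and NumPy version.

```python
import timeit
import numpy as np

x = np.random.randn(5, 100, 10)
y = np.random.randn(5)

# Each candidate only builds the reshaped view of y; the addition is left out
# so that the comparison isolates the cost of adding the dimensions.
candidates = {
    "index": lambda: y[(slice(None),) + (None,) * (x.ndim - 1)],
    "reshape": lambda: y.reshape((-1,) + (1,) * (x.ndim - 1)),
    "expand_dims": lambda: np.expand_dims(y, tuple(range(1, x.ndim))),
}

n = 10_000
timings = {name: timeit.timeit(fn, number=n) for name, fn in candidates.items()}

for name, seconds in sorted(timings.items(), key=lambda kv: kv[1]):
    print(f"{name:12s} {seconds / n * 1e6:.2f} us per call")
```

All of these take on the order of microseconds or less per call, so for arrays of realistic size the addition itself dominates the cost.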
