NumPy broadcasting vs. simple loop-CodePudding

I have been using NumPy for a while but there still are instances in which broadcasting is impenetrable to me. Please see the code below:

import numpy as np

np.random.seed(123)

# Number of measurements for the x variable
nx = 7
# Number of measurements for the y variable
ny = 11
# Number of items for which we run the simulation
nc = 23

# Fake some data
x = np.random.uniform(0, 1, size=(nx, ))
y = np.random.uniform(0, 1, size=(ny, ))

# histogram_2d represents the 2d frequency of the x, y measurements
histogram_2d = np.random.randint(0, 20, size=(nx, ny))

# c is the actual simulation results, size=(nx*ny, nc)
c = np.random.uniform(0, 9, size=(nx*ny, nc))

# Try broadcasting
c_3d = c.reshape((nc, nx, ny))
numpy_sum = (c_3d * histogram_2d).sum()

# Attempt possible replacement with a simple loop
partial_sum = 0.0
for i in range(nc):
    c_2d = np.reshape(c[:, i], (nx, ny))
    partial_sum  = (c_2d * histogram_2d).sum()
    
print('Numpy broadcasting:  ', numpy_sum)
print('Actual loop       :  ', partial_sum)

In my naivete, I was expecting the two approaches to give the same results (up to some multiple of machine precision). But on my system I get this:

Numpy broadcasting:   74331.4423599
Actual loop       :   73599.8596346

As my ignorance is showing: given that histogram_2d is a 2D array and c_3d is a 3D array, I was simply thinking that NumPy would magically expand histogram_2d with nc copies of itself in the first axis and do the multiplication. But it appears I am not quite correct.

I would like to know how to replace the condensed, broadcasted multiplication sum with a proper for loop - I am looking at some code in Fortran to do the same and this Fortran code:

hist_3d = spread(histogram_2d, 1, nc)
c_3d = reshape(c, [nc, nx, ny])
partial_sum = sum(c_3d*hist_3d)

Does not do what the NumPy broadcasting does... Which means I am doing something fundamentally wrong somewhere and/or my understanding of broadcasting is still very limited.

CodePudding user response：

In [3]: c.shape
Out[3]: (77, 23)

This isn't a good reshape; it works, but will mess up the layout

In [5]: c_3d = c.reshape((nc, nx, ny)); c_3d.shape
Out[5]: (23, 7, 11)

This is good - splitting the 77 into 7 and 11:

In [6]: c_3d = c.reshape((nx, ny,nc)); c_3d.shape
Out[6]: (7, 11, 23)

To multiply with:

In [7]: histogram_2d.shape
Out[7]: (7, 11)

Use:

In [8]: (histogram_2d[:,:,None]*c_3d).shape
Out[8]: (7, 11, 23)

In [9]: (histogram_2d[:,:,None]*c_3d).sum()
Out[9]: 73599.85963455029

With this broadcasting

 (7,11,1) and (7,11,23) => (7,11,23)

The 2 key rules are:

 add leading dimensions as need to match total ndim
 change all size 1 dimensions to match

(I used change, because the 1 may actually be changed to 0. That's not a common case, but illustrates the generality of broadcasting. )

New trailing dimensions have to be explicit. This avoids some ambiguities, as when trying to add a (2,) and (3,). One of those can be expanded to (1,2) or (1,3), but which? If one is (3,1), then expanding the other to (1,2) is unambiguous.