I'm trying to understand the logic for inferring the semantics of addition in NumPy/PyTorch. Here is an example that caused a bug in my program:
import numpy as np
x = np.arange(2) # shape (2,)
y = x.reshape(-1, 1) # shape (2, 1)
z = y + x # expected (2,) or (2, 1)
print(z.shape) # (2, 2)
So the reshape happened in an unrelated operation, but x and y still had the same number of elements, and I expected the resulting shape to be either (2,) or (2, 1), since the addition happens on the axis where all the elements live.
My questions:
- Why do I get a (2, 2) shape?
- What is the bigger picture behind this that can help me predict the outcome in similar but different scenarios?
CodePudding user response:
This is caused by broadcasting. The NumPy documentation on broadcasting gives the following example:
x = np.arange(4) # shape (4,)
xx = x.reshape(4,1) # shape (4,1)
y = np.ones(5) # shape (5,)
x + y # ValueError: operands could not be broadcast together with shapes (4,) (5,)
xx + y # shape (4, 5)
"When either of the dimensions compared is 1, the other is used. In other words, dimensions with size 1 are stretched or “copied” to match the other."
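Applying that rule to your case: broadcasting aligns shapes from the rightmost dimension and pads the shorter shape with 1s on the left, so (2, 1) vs (2,) becomes (2, 1) vs (1, 2). Both pairs contain a 1, so each size-1 dimension is stretched to match the other, and the result is (2, 2), effectively an "outer" addition. A minimal sketch of both the surprise and a fix:

```python
import numpy as np

x = np.arange(2)          # shape (2,)
y = x.reshape(-1, 1)      # shape (2, 1)

# (2, 1) vs (2,) -> pad x on the left -> (2, 1) vs (1, 2)
# Each size-1 dimension stretches to match the other -> (2, 2).
z = y + x
print(z.shape)            # (2, 2)
print(z)                  # [[0 1]
                          #  [1 2]]

# For a plain elementwise sum, make the shapes identical first,
# e.g. flatten y back to one dimension:
w = y.ravel() + x         # shape (2,)
print(w)                  # [0 2]
```

The key takeaway: broadcasting never compares the total number of elements, only the trailing dimensions one by one, so any size-1 axis is silently stretched rather than raising an error.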