What does the following code do? (grad[range(m),y] -= 1)
def delta_cross_entropy(X,y):
    """
    X is the output from fully connected layer (num_examples x num_classes)
    y is labels (num_examples x 1)
    Note that y is not one-hot encoded vector.
    It can be computed as y.argmax(axis=1) from one-hot encoded vectors of labels if required.
    """
    m = y.shape[0]
    grad = softmax(X)
    # What does this do? Does it subtract y from grad? (As that is what is supposed to happen)
    grad[range(m),y] -= 1
    grad = grad/m
    return grad
EDIT: This is not about how slices update the arrays they come from in place, as y is not a slice of grad; this question is about the syntax of NumPy.
CodePudding user response:
grad[range(m),y] -= 1 # For each row i, this subtracts 1 from grad[i,j] where j==y[i], which is the same as subtracting the one-hot encoding of y from grad.
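To make the indexing concrete, here is a minimal sketch that compares the statement with an explicit loop and with a one-hot subtraction. The toy values and the names fancy, looped, one_hot and subtracted are made up for illustration, and it assumes y is a 1-D integer array of shape (m,):

```python
import numpy as np

# Hypothetical toy data: 3 examples, 4 classes (stand-in for softmax output).
grad = np.array([[0.10, 0.20, 0.30, 0.40],
                 [0.25, 0.25, 0.25, 0.25],
                 [0.70, 0.10, 0.10, 0.10]])
y = np.array([3, 0, 1])  # 1-D class labels, one per row
m = y.shape[0]

# Integer array indexing: row index i is paired with column index y[i].
fancy = grad.copy()
fancy[range(m), y] -= 1

# The same update written as an explicit loop.
looped = grad.copy()
for i in range(m):
    looped[i, y[i]] -= 1

# The same update written as subtracting a one-hot matrix.
one_hot = np.eye(grad.shape[1])[y]  # row i has a 1 in column y[i]
subtracted = grad - one_hot

print(np.allclose(fancy, looped), np.allclose(fancy, subtracted))  # True True
```

Note that if y has shape (m, 1) rather than (m,), the two index arrays broadcast to an (m, m) grid and more entries than intended are modified, so flattening it first (for example with y.ravel()) is probably what you want.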