The following gradient descent loop fails because the gradients returned by tape.gradient() are None the second time the loop runs.
import tensorflow as tf

w = tf.Variable(tf.random.normal((3, 2)), name='w')
b = tf.Variable(tf.zeros(2, dtype=tf.float32), name='b')
x = tf.constant([[1., 2., 3.]])

for i in range(10):
    print("iter {}".format(i))
    with tf.GradientTape() as tape:
        # forward prop
        y = x @ w + b
        loss = tf.reduce_mean(y**2)
    print("loss is \n{}".format(loss))
    print("output- y is \n{}".format(y))
    # vars getting dropped after a couple of iterations
    print(tape.watched_variables())
    # get the gradients to minimize the loss
    dl_dw, dl_db = tape.gradient(loss, [w, b])
    # descend the gradients
    w = w.assign_sub(0.001*dl_dw)
    b = b.assign_sub(0.001*dl_db)
iter 0
loss is
23.328645706176758
output- y is
[[ 6.8125362 -0.49663293]]
(<tf.Variable 'w:0' shape=(3, 2) dtype=float32, numpy=
array([[-1.3461215 , 0.43708783],
[ 1.5931423 , 0.31951016],
[ 1.6574576 , -0.52424705]], dtype=float32)>, <tf.Variable 'b:0' shape=(2,) dtype=float32, numpy=array([0., 0.], dtype=float32)>)
iter 1
loss is
22.634033203125
output- y is
[[ 6.7103477 -0.48918355]]
()
TypeError Traceback (most recent call last)
c:\projects\pyspace\mltest\test.ipynb Cell 7' in <cell line: 1>()
11 dl_dw, dl_db = tape.gradient(loss,[w,b])
13 #descend the gradients
---> 14 w = w.assign_sub(0.001*dl_dw)
15 b = b.assign_sub(0.001*dl_db)
TypeError: unsupported operand type(s) for *: 'float' and 'NoneType'
I checked the documentation, which explains the cases in which gradients can become None, but none of them helped here.
CodePudding user response:
This is because assign_sub returns a Tensor. In the line w = w.assign_sub(0.001*dl_dw) you are thus overwriting w with a tensor holding the new value. In the next iteration it is therefore no longer a Variable and is not tracked by the gradient tape by default, so the gradient becomes None (tensors also do not have an assign_sub method, so that would crash as well).
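You can see this behavior in isolation with a minimal sketch (the tensor t below is a made-up example, not from the question): differentiating with respect to a plain tensor that the tape never watched yields None.

import tensorflow as tf

t = tf.constant(3.0)            # a plain Tensor, not a tf.Variable
with tf.GradientTape() as tape:
    y = t * t                   # the tape does not watch constants by default
print(tape.gradient(y, t))      # prints: None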
Instead, simply write w.assign_sub(0.001*dl_dw), and likewise for b. The assign functions work in place, so no reassignment is necessary.
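For reference, here is a sketch of the loop from the question with that fix applied (same shapes, learning rate, and iteration count as above; the print formatting is simplified):

import tensorflow as tf

w = tf.Variable(tf.random.normal((3, 2)), name='w')
b = tf.Variable(tf.zeros(2, dtype=tf.float32), name='b')
x = tf.constant([[1., 2., 3.]])

for i in range(10):
    with tf.GradientTape() as tape:
        y = x @ w + b
        loss = tf.reduce_mean(y**2)
    dl_dw, dl_db = tape.gradient(loss, [w, b])
    # update in place; do NOT rebind w and b to the returned tensors
    w.assign_sub(0.001 * dl_dw)
    b.assign_sub(0.001 * dl_db)
    print("iter {}: loss {}".format(i, loss.numpy()))

Because w and b stay bound to the same Variable objects, the tape keeps watching them on every iteration and the gradients never become None.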