I want to calculate gradients for both trainable and non-trainable variables, but update only the trainable parameters.
At first, I implemented it as follows:
with tf.GradientTape(persistent=True) as g:
    preds = model(data)
    loss = criterion(labels, preds)

gradients = g.gradient(loss, model.trainable_variables)
optimizer.apply_gradients(zip(gradients, model.trainable_variables))
non_train_gradients = g.gradient(loss, model.non_trainable_variables)
However, the above code runs backpropagation twice to calculate the gradients.
I want to compute the gradients of both trainable and non-trainable variables in a single backward pass,
but update only the trainable parameters.
How can I do it?
CodePudding user response:
We can use the fact that the gradients are just a list and are returned in the same order as the variables we put in:
n_trainable = len(model.trainable_variables)
gradients = g.gradient(loss, model.trainable_variables + model.non_trainable_variables)
trainable_gradients = gradients[:n_trainable]
non_trainable_gradients = gradients[n_trainable:]
optimizer.apply_gradients(zip(trainable_gradients, model.trainable_variables))
That is, we just put all the non-trainable variables at the end, and then split the gradients at that point.
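Putting it together, here is a minimal end-to-end sketch. The two-layer model, the frozen first layer, the loss, and the dummy data are my own assumptions for illustration; the point is simply that one g.gradient call over the concatenated variable list gives you both sets of gradients, while apply_gradients only touches the trainable ones. Note that persistent=True is no longer needed, since the tape is used only once.

import tensorflow as tf

# Hypothetical setup: a two-layer model whose first layer is frozen,
# so its kernel/bias appear in model.non_trainable_variables but still
# receive well-defined gradients through the forward pass.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(8, activation="relu"),
    tf.keras.layers.Dense(1),
])
model.layers[0].trainable = False  # freeze the first Dense layer

optimizer = tf.keras.optimizers.Adam()
criterion = tf.keras.losses.MeanSquaredError()

data = tf.random.normal((32, 4))    # dummy batch (assumption)
labels = tf.random.normal((32, 1))

all_variables = model.trainable_variables + model.non_trainable_variables
n_trainable = len(model.trainable_variables)

with tf.GradientTape() as g:
    preds = model(data, training=True)
    loss = criterion(labels, preds)

# Single backward pass over the concatenated variable list
gradients = g.gradient(loss, all_variables)
trainable_gradients = gradients[:n_trainable]
non_trainable_gradients = gradients[n_trainable:]

# Update only the trainable parameters
optimizer.apply_gradients(zip(trainable_gradients, model.trainable_variables))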