We know that automatic differentiation is achieved with tf.GradientTape in Python, like this:
import tensorflow as tf

with tf.GradientTape(persistent=True) as tape1:
    func_1 = u(x, y)                                   # u: the single-output network; x, y: watched tf.Variables
d_fun1_dx, d_fun1_dy = tape1.gradient(func_1, [x, y])
del tape1                                              # release the persistent tape
This gets the derivatives of a single-output neural network.
But I have a neural network with two inputs x, y and two outputs f1, f2, and I want to get df1/dx, df1/dy, df2/dx and df2/dy. How can I achieve this?
CodePudding user response:
What you are looking for is a Jacobian, not a gradient. It is implemented in TensorFlow as tape1.jacobian
and will return the Jacobian matrix of partial derivatives.
Example from the documentation:
with tf.GradientTape() as g:
    x = tf.constant([1.0, 2.0])
    g.watch(x)
    y = x * x
jacobian = g.jacobian(y, x)
# jacobian value is [[2., 0.], [0., 4.]]
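Applied to your two-input, two-output case, a minimal sketch could look like the following (the small Keras model is just a hypothetical stand-in for your actual network; the two inputs are stacked into one variable so that a single jacobian call returns all four partial derivatives):

import tensorflow as tf

# Hypothetical two-input, two-output network standing in for u(x, y) -> (f1, f2)
model = tf.keras.Sequential([
    tf.keras.Input(shape=(2,)),
    tf.keras.layers.Dense(16, activation="tanh"),
    tf.keras.layers.Dense(2),
])

xy = tf.Variable([[1.0, 2.0]])        # one row holding (x, y); Variables are watched automatically

with tf.GradientTape() as tape:
    f = model(xy)                     # shape (1, 2): columns are f1, f2

jac = tape.jacobian(f, xy)            # shape (1, 2, 1, 2): output index x input index
df1_dx, df1_dy = jac[0, 0, 0, 0], jac[0, 0, 0, 1]
df2_dx, df2_dy = jac[0, 1, 0, 0], jac[0, 1, 0, 1]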
That being said, using Jacobians usually requires more advanced methods; what you plan to do with them will really tell you whether you need a Jacobian at all. For example, if you were simply running gradient descent, you would now have to decide what to do with the two gradients per parameter. Are you going to analyse them? Are you just going to add them? If you were to add them, then note that
df1/dx + df2/dx = d(f1 + f2)/dx
so it is equivalent to simply adding the outputs and computing the normal gradient. There are of course uses for Jacobians, but they go well beyond typical gradient descent algorithms.
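As a quick check of that addition equivalence, here is a sketch (again with a hypothetical small network as a stand-in):

import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(2,)),
    tf.keras.layers.Dense(16, activation="tanh"),
    tf.keras.layers.Dense(2),
])
xy = tf.Variable([[1.0, 2.0]])        # one row holding (x, y)

with tf.GradientTape(persistent=True) as tape:
    f = model(xy)
    f1, f2 = f[:, 0], f[:, 1]
    total = f1 + f2

grad_of_sum = tape.gradient(total, xy)                        # d(f1 + f2)/d(x, y)
sum_of_grads = tape.gradient(f1, xy) + tape.gradient(f2, xy)  # df1/d(x, y) + df2/d(x, y)
del tape
# grad_of_sum and sum_of_grads are numerically identical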