I am trying to create a weight matrix in TensorFlow as shown below
[[a, b],
[c, b],
[a, d]]
where a, b, c, d are tf.Variables. Gradient flow should update only the variables a, b, c, d. I initialize the four variables separately and then build the weight matrix w as follows:
a = tf.Variable(1., name='a', trainable=True)
b = tf.Variable(1., name='b', trainable=True)
c = tf.Variable(1., name='c', trainable=True)
d = tf.Variable(1., name='d', trainable=True)
w = tf.Variable([[a, b], [c, b], [a, d]], name='w')
However, the gradient flow is not tied to the variables: tape.gradient returns None for a, b, c, d, while it does return a gradient for w. You can check the gradients with:
x = [[1., 2., 3.]]
with tf.GradientTape(persistent=True) as tape:
    y = x @ w
    loss = y
print(tape.gradient(loss, [w, a, b, c, d]))
Is there some way to initialize the weight matrix with such constraints?
Answer:
Try using a plain Python list instead of tf.Variable for w, since nested variables do not seem to be recognized (wrapping them in tf.Variable copies their initial values into a new, independent variable):
import tensorflow as tf
a = tf.Variable(1., name='a', trainable=True)
b = tf.Variable(1., name='b', trainable=True)
c = tf.Variable(1., name='c', trainable=True)
d = tf.Variable(1., name='d', trainable=True)
w = [[a, b], [c, b], [a, d]]
x = tf.constant([[1., 2., 3.]])
with tf.GradientTape(persistent=True) as tape:
    y = x @ w
    loss = y
print(*tape.gradient(loss, w), sep='\n')
[<tf.Tensor: shape=(), dtype=float32, numpy=4.0>, <tf.Tensor: shape=(), dtype=float32, numpy=3.0>]
[<tf.Tensor: shape=(), dtype=float32, numpy=2.0>, <tf.Tensor: shape=(), dtype=float32, numpy=3.0>]
[<tf.Tensor: shape=(), dtype=float32, numpy=4.0>, <tf.Tensor: shape=(), dtype=float32, numpy=3.0>]
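If you want w as a single tensor rather than a nested list, another option (not from the original answer, but a common pattern) is to rebuild w from the variables with tf.stack inside the tape on every forward pass, so the reads of a, b, c, d are recorded and gradients accumulate per variable:

```python
import tensorflow as tf

a = tf.Variable(1., name='a')
b = tf.Variable(1., name='b')
c = tf.Variable(1., name='c')
d = tf.Variable(1., name='d')

x = tf.constant([[1., 2., 3.]])

with tf.GradientTape() as tape:
    # Build w from the variables inside the tape so each read is recorded.
    # Shape (3, 2), with a and b each appearing twice.
    w = tf.stack([tf.stack([a, b]),
                  tf.stack([c, b]),
                  tf.stack([a, d])])
    y = x @ w
    loss = tf.reduce_sum(y)

grads = tape.gradient(loss, [a, b, c, d])
print([g.numpy() for g in grads])  # [4.0, 3.0, 2.0, 3.0]
```

Because a appears at w[0, 0] and w[2, 0], its gradient is 1 + 3 = 4; b appears at w[0, 1] and w[1, 1], giving 1 + 2 = 3, matching the per-entry gradients printed above.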