I am trying to create a weight matrix in TensorFlow as shown below
[[a, b],
[c, b],
[a, d]]
where a, b, c, d are tf.Variables. Gradient flow should update only the variables a, b, c, d. I initialize the four variables separately and then build the weight matrix w as follows:
a = tf.Variable(1., name='a', trainable=True)
b = tf.Variable(1., name='b', trainable=True)
c = tf.Variable(1., name='c', trainable=True)
d = tf.Variable(1., name='d', trainable=True)
w = tf.Variable([[a, b], [c, b], [a, d]], name='w')
However, the gradient flow is not tied to the variables: tape.gradient returns None for a, b, c, d, while it does return a gradient for w. You can check the gradients with:
x = [[1., 2., 3.]]
with tf.GradientTape(persistent=True) as tape:
    y = x @ w
    loss = y
print(tape.gradient(loss, [w, a, b, c, d]))
Is there some way to initialize the weight matrix with such constraints?
Answer:
Try using a plain Python list instead of tf.Variable for w, since nested variables do not seem to be recognized (wrapping them in tf.Variable copies their initial values into a new, independent variable):
import tensorflow as tf
a = tf.Variable(1., name='a', trainable=True)
b = tf.Variable(1., name='b', trainable=True)
c = tf.Variable(1., name='c', trainable=True)
d = tf.Variable(1., name='d', trainable=True)
w = [[a, b], [c, b], [a, d]]
x = tf.constant([[1., 2., 3.]])
with tf.GradientTape(persistent=True) as tape:
    y = x @ w
    loss = y
print(*tape.gradient(loss, w), sep='\n')
[<tf.Tensor: shape=(), dtype=float32, numpy=4.0>, <tf.Tensor: shape=(), dtype=float32, numpy=3.0>]
[<tf.Tensor: shape=(), dtype=float32, numpy=2.0>, <tf.Tensor: shape=(), dtype=float32, numpy=3.0>]
[<tf.Tensor: shape=(), dtype=float32, numpy=4.0>, <tf.Tensor: shape=(), dtype=float32, numpy=3.0>]
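If you want w as a single tensor rather than a nested list, another option (not from the original answer, but a common pattern) is to rebuild w from the variables with tf.stack inside the tape on every forward pass, so the reads of a, b, c, d are recorded and gradients accumulate per variable:

```python
import tensorflow as tf

a = tf.Variable(1., name='a')
b = tf.Variable(1., name='b')
c = tf.Variable(1., name='c')
d = tf.Variable(1., name='d')

x = tf.constant([[1., 2., 3.]])

with tf.GradientTape() as tape:
    # Build w from the variables inside the tape so each read is recorded.
    # Shape (3, 2), with a and b each appearing twice.
    w = tf.stack([tf.stack([a, b]),
                  tf.stack([c, b]),
                  tf.stack([a, d])])
    y = x @ w
    loss = tf.reduce_sum(y)

grads = tape.gradient(loss, [a, b, c, d])
print([g.numpy() for g in grads])  # [4.0, 3.0, 2.0, 3.0]
```

Because a appears at w[0, 0] and w[2, 0], its gradient is 1 + 3 = 4; b appears at w[0, 1] and w[1, 1], giving 1 + 2 = 3, matching the per-entry gradients printed above.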