I want to initialise weights orthogonally with Keras / TensorFlow 1.14.
To do so, I am modifying the following code (which works fine):
def __dictNN(self, x):
    # Parameters
    dim = self.__dim
    hdim = self.__hdim
    ddim = self.__ddim
    kmatdim = ddim + 1 + dim
    num_layers = self.__num_layers
    std = 1.0 / np.sqrt(hdim)
    std_proj = 1.0 / np.sqrt(dim)
    with tf.variable_scope("Input_projection",
                           initializer=tf.random_uniform_initializer(
                               maxval=std_proj, minval=-std_proj)):
        P = tf.get_variable(name='weights',
                            shape=(dim, hdim),
                            dtype=tf.float64)
        res_in = tf.matmul(x, P)
    with tf.variable_scope("Residual"):
        for j in range(self.__num_layers):
            layer_name = "Layer_" + str(j)
            with tf.variable_scope(layer_name):
                W = tf.get_variable(name="weights", shape=(hdim, hdim),
                                    dtype=tf.float64)
                b = tf.get_variable(name="biases", shape=(hdim),
                                    dtype=tf.float64)
                if j == 0:  # first layer
                    res_out = res_in + self.__tf_nlr(
                        tf.matmul(res_in, W) + b)
                else:  # subsequent layers
                    res_out = res_out + self.__tf_nlr(
                        tf.matmul(res_out, W) + b)
    with tf.variable_scope("Output_projection",
                           initializer=tf.random_uniform_initializer(
                               maxval=std, minval=-std)):
        W = tf.get_variable(name="weights", shape=(hdim, ddim),
                            dtype=tf.float64)
        b = tf.get_variable(name="biases", shape=(ddim),
                            dtype=tf.float64)
        out = tf.matmul(res_out, W) + b
    return out
So I am replacing initializer=tf.random_uniform_initializer()
with initializer=tf.orthogonal_initializer(),
but it does not seem to work. The error I get is:
ValueError: The tensor to initialize must be at least two-dimensional
I hope you can help me.
CodePudding user response:
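According to the TensorFlow documentation, the orthogonal initializer fills a 2D tensor with an orthogonal matrix. If the tensor is not square, either the rows or the columns (whichever are fewer) will be orthogonal. When the rank is > 2, the tensor is reshaped to 2 dimensions first. However, 1D tensors such as b are not handled. Because the initializer is set on the variable scope, it is applied to every variable created inside that scope, including the 1D biases, which is what triggers the error.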
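To make the behaviour described above concrete, here is a rough NumPy sketch of a QR-based orthogonal initialization. It is only an illustration of the idea, not TensorFlow's actual source; the function name `orthogonal_sketch` and its `seed` argument are made up for this example.

```python
import numpy as np

def orthogonal_sketch(shape, seed=0):
    """Illustrative QR-based orthogonal init (hypothetical helper, not TF code)."""
    if len(shape) < 2:
        # Same failure mode as in the question: a 1D shape has no rows/columns.
        raise ValueError(
            "The tensor to initialize must be at least two-dimensional")
    # Tensors of rank > 2 are flattened to (prod(shape[:-1]), shape[-1]).
    rows = int(np.prod(shape[:-1]))
    cols = shape[-1]
    rng = np.random.default_rng(seed)
    # Draw a tall random matrix so that QR yields orthonormal columns.
    a = rng.standard_normal((max(rows, cols), min(rows, cols)))
    q, r = np.linalg.qr(a)
    # Fix the signs so the result does not depend on the QR sign convention.
    q = q * np.sign(np.diag(r))
    if rows < cols:
        # Fewer rows than columns: transpose so the rows are orthonormal.
        q = q.T
    return q.reshape(shape)

tall = orthogonal_sketch((8, 3))   # orthonormal columns
wide = orthogonal_sketch((3, 8))   # orthonormal rows
```

Calling `orthogonal_sketch((3,))` raises the same kind of `ValueError` as in the question, since there is no second dimension to orthogonalize against.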
The trick: change your biases to 2D tensors:
b = tf.get_variable(name="biases", shape=(1, ddim),
                    dtype=tf.float64)
This should "orthogonalize" the single row (which here presumably just means normalizing it). Note that I have not tested this, and it may raise an error if the initializer expects more than one row.
Or, more simply, use a different initializer for the biases.
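A minimal sketch of that second option, assuming it is acceptable to start the biases at zero: an `initializer` passed directly to `tf.get_variable` takes precedence over the one set on the enclosing variable scope, so only the 2D weights receive the orthogonal initializer. The sizes `hdim = 8` and `ddim = 3` are placeholders for illustration, and the `compat.v1` import is used so that the snippet can also run on TensorFlow 2.x; on 1.14 a plain `import tensorflow as tf` should behave the same way.

```python
import numpy as np
import tensorflow.compat.v1 as tf

tf.disable_v2_behavior()  # graph mode, as in TensorFlow 1.x

hdim, ddim = 8, 3  # placeholder sizes, for illustration only

tf.reset_default_graph()
with tf.variable_scope("Output_projection",
                       initializer=tf.orthogonal_initializer()):
    # 2D variable: inherits the orthogonal initializer from the scope.
    W = tf.get_variable(name="weights", shape=(hdim, ddim),
                        dtype=tf.float64)
    # 1D variable: the explicit initializer overrides the scope default.
    b = tf.get_variable(name="biases", shape=(ddim,),
                        dtype=tf.float64,
                        initializer=tf.zeros_initializer())

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    W_val, b_val = sess.run([W, b])
```

The same change would be needed for any other 1D variable created inside a scope whose default initializer is orthogonal. The bias keeps its original 1D shape, so the rest of the network code does not have to change.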