Orthogonal weights in a network with Keras / TensorFlow

I want to initialise orthogonal weights with Keras / Tensorflow 1.14

So, I am modifying the following code (which works fine):

    def __dictNN(self, x):
        # Parameters
        dim = self.__dim
        hdim = self.__hdim
        ddim = self.__ddim
        kmatdim = ddim + 1 + dim
        num_layers = self.__num_layers
        std = 1.0 / np.sqrt(hdim)
        std_proj = 1.0 / np.sqrt(dim)
        with tf.variable_scope("Input_projection",
                               initializer=tf.random_uniform_initializer(
                                maxval=std_proj, minval=-std_proj)):
            P = tf.get_variable(name='weights',
                                shape=(dim,hdim),
                                dtype=tf.float64)
            res_in = tf.matmul(x, P)
        with tf.variable_scope("Residual"):
            for j in range(self.__num_layers):
                layer_name = "Layer_" + str(j)
                with tf.variable_scope(layer_name):
                    W = tf.get_variable(name="weights", shape=(hdim,hdim),
                                        dtype=tf.float64)
                    b = tf.get_variable(name="biases", shape=(hdim),
                                        dtype=tf.float64)
                    if j==0: # first layer
                        res_out = res_in + self.__tf_nlr(
                            tf.matmul(res_in, W) + b)
                    else: # subsequent layers
                        res_out = res_out + self.__tf_nlr(
                            tf.matmul(res_out, W) + b)
        with tf.variable_scope("Output_projection",
                            initializer=tf.random_uniform_initializer(
                            maxval=std, minval=-std)):
            W = tf.get_variable(name="weights", shape=(hdim, ddim),
                            dtype=tf.float64)
            b = tf.get_variable(name="biases", shape=(ddim),
                            dtype=tf.float64)
            out = tf.matmul(res_out, W) + b
        return out

So I replaced initializer=tf.random_uniform_initializer() with initializer=tf.orthogonal_initializer(), but it does not seem to work. The error I get is:

ValueError: The tensor to initialize must be at least two-dimensional

I hope that you can help me

CodePudding user response:

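According to the TensorFlow documentation, the orthogonal initializer fills a 2D tensor with an orthogonal matrix. If the matrix is not square, either the rows or the columns (whichever are fewer) will be orthogonal. When the rank is greater than 2, the tensor is reshaped to 2 dimensions first. However, 1D tensors such as b are not handled. Because the initializer is set on the variable scope, it is also applied to the biases created inside that scope, and that is what raises the error.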

The trick: change your biases to 2D tensors:

    b = tf.get_variable(name="biases", shape=(1, ddim),
                        dtype=tf.float64)

This should "orthogonalize" the single row, which for one row presumably just means normalizing it. Note that I have not tested this, and it may raise an error if the initializer expects more than one row.
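To get a feel for what a (1, ddim) shape would produce, here is a minimal NumPy sketch of QR-based orthogonal initialization. It is an illustration of the idea only, not TensorFlow's actual implementation; the function name `orthogonal_init`, the `gain` and `seed` parameters, and the example shapes are all made up for the demo.

```python
import numpy as np

def orthogonal_init(shape, gain=1.0, seed=0):
    """QR-based orthogonal initialization (illustrative sketch, not TF's code)."""
    if len(shape) < 2:
        # Same situation as in the question: a 1-D bias has no rows/columns.
        raise ValueError("The tensor to initialize must be at least two-dimensional")
    rows = int(np.prod(shape[:-1]))   # leading dimensions are flattened
    cols = shape[-1]
    rng = np.random.default_rng(seed)
    # Decompose a tall (or square) Gaussian matrix so Q has orthonormal columns.
    a = rng.standard_normal((max(rows, cols), min(rows, cols)))
    q, r = np.linalg.qr(a)
    # Fix column signs so the result does not depend on the QR sign convention.
    q = q * np.sign(np.diag(r))
    if rows < cols:
        q = q.T                       # fewer rows than columns: rows are orthonormal
    return gain * q.reshape(shape)

W = orthogonal_init((4, 4))   # square: W.T @ W is the identity
P = orthogonal_init((3, 5))   # wide: the 3 rows are orthonormal
b = orthogonal_init((1, 6))   # single row: only its norm is constrained

print(np.allclose(W.T @ W, np.eye(4)))
print(np.allclose(P @ P.T, np.eye(3)))
print(np.linalg.norm(b))

try:
    orthogonal_init((6,))     # 1-D shape, like the original biases
except ValueError as err:
    print(err)
```

In this sketch a single row simply ends up with unit norm, so for a bias the "orthogonal" constraint amounts to a random direction of fixed length, which is rarely what one wants for biases.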

Or, more simply, use a different initializer for the biases.
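A minimal sketch of that second option, assuming TensorFlow is installed: the scope keeps the orthogonal initializer for the weights, and the bias gets its own initializer passed directly to tf.get_variable, which overrides the scope default. The zeros initializer is just one reasonable choice, not something the question prescribes, and `hdim = 4` and the `tf1` alias are hypothetical names for the demo. The 1.x API is reached through tf.compat.v1 so that the sketch should also run on TensorFlow 2.x; on 1.14 the plain tf.* names are the same functions.

```python
import numpy as np
import tensorflow as tf

tf1 = tf.compat.v1  # TF 1.x-style API (alias chosen for this sketch)

hdim = 4  # hypothetical layer width, just for the demo

graph = tf.Graph()
with graph.as_default():
    # The scope-level initializer is the default for every variable below.
    with tf1.variable_scope("Layer_0",
                            initializer=tf1.orthogonal_initializer()):
        W = tf1.get_variable(name="weights", shape=(hdim, hdim),
                             dtype=tf.float64)
        # An explicit initializer overrides the scope default for the 1-D bias.
        b = tf1.get_variable(name="biases", shape=(hdim,),
                             dtype=tf.float64,
                             initializer=tf1.zeros_initializer())
    init = tf1.global_variables_initializer()

with tf1.Session(graph=graph) as sess:
    sess.run(init)
    W_val, b_val = sess.run([W, b])

print(np.allclose(W_val.T @ W_val, np.eye(hdim)))  # weights are orthogonal
print(b_val)                                       # biases start at zero
```

The same pattern applies to the "Output_projection" scope in the question: leave the scope initializer as it is and pass initializer= only on the biases variable.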