Is there a way to remove a specific neuron inside a model?
For example, I have a model with a Dense layer with 512 neurons. Is there a way to remove all the neurons whose indices are inside list_indices?
Of course, removing a neuron will affect the next layer, and even the one before.
Example:
I have this common model, which appears in multiple papers:
import functools
import tensorflow as tf

def create_model(only_digits=True):
    data_format = 'channels_last'
    input_shape = [28, 28, 1]
    max_pool = functools.partial(
        tf.keras.layers.MaxPooling2D,
        pool_size=(2, 2),
        padding='same',
        data_format=data_format)
    conv2d = functools.partial(
        tf.keras.layers.Conv2D,
        kernel_size=5,
        padding='same',
        data_format=data_format,
        activation=tf.nn.relu)
    model = tf.keras.models.Sequential([
        conv2d(filters=32, input_shape=input_shape),
        max_pool(),
        conv2d(filters=64),
        max_pool(),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(512, activation=tf.nn.relu),
        tf.keras.layers.Dense(10 if only_digits else 62),
    ])
    return model
Let's say that from the layer tf.keras.layers.Dense(512, activation=tf.nn.relu) I want to remove 100 neurons, basically turning them off.
Of course, I would then have a new model with the layer tf.keras.layers.Dense(412, activation=tf.nn.relu) instead of tf.keras.layers.Dense(512, activation=tf.nn.relu), but this modification should be propagated to the weights of the next layer too, because the connections from the removed neurons to the next layer are deleted as well.
Any input on how to do so? I could do this manually.
The model's weight shapes, if I understand correctly, are: [5, 5, 1, 32], [32], [5, 5, 32, 64], [64], [3136, 512], [512], [512, 62], [62].
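These shapes can be verified directly from the model (a quick sanity check, assuming the create_model function above):

shapes = [w.shape for w in create_model(only_digits=False).get_weights()]
print(shapes)
# [(5, 5, 1, 32), (32,), (5, 5, 32, 64), (64,), (3136, 512), (512,), (512, 62), (62,)]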
So I could do something like this:
- Generate all the indices I need and save them inside list_indices.
- Access the weights of the layer tf.keras.layers.Dense(512, activation=tf.nn.relu), and create a tensor containing only the weights of the neurons that are not inside list_indices (see the sketch below).
- Assign the new tensor of weights to the layer tf.keras.layers.Dense(412, activation=tf.nn.relu) of the submodel.
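For the Dense(512) layer itself, I think I can already do the slicing. A minimal sketch of the steps above (np.delete, the layer index and the variable names are just my illustration):

import numpy as np

list_indices = list(range(100))        # e.g. remove the first 100 neurons
dense = model.layers[5]                # the Dense(512) layer
kernel, bias = dense.get_weights()     # shapes (3136, 512) and (512,)

# keep only the columns (and bias entries) of the surviving neurons
new_kernel = np.delete(kernel, list_indices, axis=1)   # (3136, 412)
new_bias = np.delete(bias, list_indices, axis=0)       # (412,)
# these could then be assigned to the Dense(412) layer of the submodel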
The problem is that I don't know how to get the correct weights of the next layer, the ones that correspond to the indices of the neurons I just removed and that I should assign to the next layer of the submodel. I hope I have explained myself clearly.
Thanks, Leyla.
CodePudding user response:
Your operation is known in the literature as selective dropout: there is no actual need to create a different model every time, you just need to multiply the output of your selected neurons by 0, so that the input of the next layer does not take those activations into account.
Note that if you "turn off" a neuron in the layer Ln, it does not completely "turn off" any neuron in the layer Ln+1, supposing that both are fully-connected (dense) layers: each neuron in the Ln+1 layer is connected to ALL the neurons in the previous layer. In other words, removing a neuron in a fully-connected (dense) layer does not affect the dimension of the next one.
You can simply implement this operation with the Multiply layer (Keras). The drawback is that you need to learn how to use the Keras functional API. There are other ways, but they are more complex than this (e.g. a custom layer); besides, the functional API is very useful and powerful in many respects, so it is highly recommended reading!
Your model would become like this:
data_format = 'channels_last'
input_shape = [28, 28, 1]
max_pool = ...
conv2d = ...

# convert a list of indexes to a mask tensor of weights
def make_index_weights(indexes, units):
    # 0.0 at the positions to drop, 1.0 everywhere else
    indexes = [ float(i not in indexes) for i in range(units) ]
    # converting the mask from a list to a tensor
    indexes = tf.convert_to_tensor(indexes)
    # reshaping to the correct format: one mask value per unit
    indexes = tf.reshape(indexes, (1, units))
    # ensuring it is a float tensor
    indexes = tf.cast(indexes, 'float32')
    return indexes

# layer builder utility
def selective_dropout(units, indexes, **kwargs):
    indexes = make_index_weights(indexes, units)
    dense = tf.keras.layers.Dense(units, **kwargs)
    mul = tf.keras.layers.Multiply()
    # return the tensor builder
    return lambda inputs: mul([ dense(inputs), indexes ])

input_layer = tf.keras.layers.Input(input_shape)
conv_1 = conv2d(filters=32)(input_layer)
maxp_1 = max_pool()(conv_1)
conv_2 = conv2d(filters=64)(maxp_1)
maxp_2 = max_pool()(conv_2)
flat = tf.keras.layers.Flatten()(maxp_2)
sel_drop_1 = selective_dropout(512, INDEXES, activation=tf.nn.relu)(flat)
dense_2 = tf.keras.layers.Dense(10 if only_digits else 62)(sel_drop_1)
output_layer = dense_2
model = tf.keras.models.Model([ input_layer ], [ output_layer ])
return model
Now you just need to build up your INDEXES list according to the indexes of the neurons you need to remove.
In your case, the mask tensor will have a shape of 1x512, because there are 512 weights (units/neurons) in the dense layer, so you need to provide as many mask values as there are units. The selective_dropout function lets you pass a list of indexes to discard, and it automatically builds up the desired tensor.
For example, if you want to remove the neurons 1, 10 and 12, you just pass the list [1, 10, 12] to the function and it will produce a 1x512 tensor with 0.0 at those positions and 1.0 at all the others.
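For instance, a quick check of the mask with a small number of units (8 here, just for readability; make_index_weights is the helper defined above):

mask = make_index_weights([1, 3], units=8)
print(mask)
# tf.Tensor([[1. 0. 1. 0. 1. 1. 1. 1.]], shape=(1, 8), dtype=float32)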
EDIT:
As you mentioned, you strictly need to reduce the number of parameters in your model.
Each dense layer is described by the relation y = Wx + B, where W is the kernel (or weight matrix) and B is the bias vector. W is a matrix of INPUT x OUTPUT dimensions, where INPUT is the last layer's output shape and OUTPUT is the number of neurons/units/weights in the layer; B is just a vector of dimension 1 x OUTPUT (but we are not interested in it here).
The problem now is that you are dropping N neurons in the layer Ln, and this induces the drop of N x OUTPUT weights in the layer Ln+1. Let's be practical with some numbers. In your case (supposing only_digits is true) you start with:
Nx512 -> 512x10 (5120 weights)
And after dropping 100 neurons (which means a drop of 100*10 = 1000 weights in the next layer):
Nx412 -> 412x10 (4120 weights)
Now, each column of the W matrix describes a neuron of the next layer, as a vector of weights with a dimension equal to the previous layer's output dimension (in our case 512 or 412). The rows of the matrix instead represent the single neurons of the previous layer.
W[0,0] indicates the relation between the first neuron of layer n and the first neuron of layer n+1:
W[0,0] -> 1st of n, 1st of n+1
W[0,1] -> 1st of n, 2nd of n+1
W[1,0] -> 2nd of n, 1st of n+1
And so on. So you can just remove from this matrix all the rows related to the neuron indexes you removed: index 0 -> row 0.
You can access the W matrix as a tensor from the dense layer with dense.kernel (and the bias vector with dense.bias).
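To make the row removal concrete, here is a minimal sketch of the whole pruning, assuming model is the original Sequential model from the question and drop_indices holds the neurons to remove (prune_dense_pair and the NumPy calls are my own illustration, not a Keras API):

import numpy as np

def prune_dense_pair(dense_a, dense_b, drop_indices):
    # remove units from dense_a and the matching input rows of dense_b
    kernel_a, bias_a = dense_a.get_weights()   # e.g. (3136, 512), (512,)
    kernel_b, bias_b = dense_b.get_weights()   # e.g. (512, 10), (10,)
    # drop the COLUMNS of W_a and the bias entries of the removed units
    kernel_a = np.delete(kernel_a, drop_indices, axis=1)   # (3136, 412)
    bias_a = np.delete(bias_a, drop_indices, axis=0)       # (412,)
    # drop the matching ROWS of W_b: index i -> row i
    kernel_b = np.delete(kernel_b, drop_indices, axis=0)   # (412, 10)
    return (kernel_a, bias_a), (kernel_b, bias_b)

# usage: build the smaller architecture, then assign the pruned weights
drop_indices = list(range(100))
(w1, b1), (w2, b2) = prune_dense_pair(model.layers[5], model.layers[6],
                                      drop_indices)
small_model = ...  # same model, but with Dense(412) instead of Dense(512)
small_model.layers[5].set_weights([w1, b1])
small_model.layers[6].set_weights([w2, b2])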