Is there a way to remove a specific neuron inside a model?
For example, I have a model with a Dense layer with 512 neurons. Is there a way to remove all the neurons whose indices are inside list_indices?
Of course, removing a neuron will affect the next layer, and even the one before.
Example:
I have this common model, which appears in multiple papers:
import functools
import tensorflow as tf

def create_model(only_digits=True):
    data_format = 'channels_last'
    input_shape = [28, 28, 1]
    max_pool = functools.partial(
        tf.keras.layers.MaxPooling2D,
        pool_size=(2, 2),
        padding='same',
        data_format=data_format)
    conv2d = functools.partial(
        tf.keras.layers.Conv2D,
        kernel_size=5,
        padding='same',
        data_format=data_format,
        activation=tf.nn.relu)
    model = tf.keras.models.Sequential([
        conv2d(filters=32, input_shape=input_shape),
        max_pool(),
        conv2d(filters=64),
        max_pool(),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(512, activation=tf.nn.relu),
        tf.keras.layers.Dense(10 if only_digits else 62),
    ])
    return model
Let's say that from the layer tf.keras.layers.Dense(512, activation=tf.nn.relu) I want to remove 100 neurons, basically turning them off.
Of course, I would then have a new model with the layer tf.keras.layers.Dense(412, activation=tf.nn.relu) instead of tf.keras.layers.Dense(512, activation=tf.nn.relu), but this modification should be propagated to the weights of the next layer too, because the connections from the removed neurons to the next layer are deleted as well.
Any input on how to do so? I could do this manually.
The model's weight shapes, if I understand correctly, are: [5, 5, 1, 32], [32], [5, 5, 32, 64], [64], [3136, 512], [512], [512, 62], [62].
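These shapes can be verified directly from the model (a quick sanity check, assuming the create_model function above):

shapes = [w.shape for w in create_model(only_digits=False).get_weights()]
print(shapes)
# [(5, 5, 1, 32), (32,), (5, 5, 32, 64), (64,), (3136, 512), (512,), (512, 62), (62,)]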
So I could do something like this:
- Generate all the indices I need and save them inside list_indices.
- Access the weights of the layer tf.keras.layers.Dense(512, activation=tf.nn.relu), and create a tensor containing only the weights of the neurons that are not inside list_indices (see the sketch below).
- Assign the new tensor of weights to the layer tf.keras.layers.Dense(412, activation=tf.nn.relu) of the submodel.
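For the Dense(512) layer itself, I think I can already do the slicing. A minimal sketch of the steps above (np.delete, the layer index and the variable names are just my illustration):

import numpy as np

list_indices = list(range(100))        # e.g. remove the first 100 neurons
dense = model.layers[5]                # the Dense(512) layer
kernel, bias = dense.get_weights()     # shapes (3136, 512) and (512,)

# keep only the columns (and bias entries) of the surviving neurons
new_kernel = np.delete(kernel, list_indices, axis=1)   # (3136, 412)
new_bias = np.delete(bias, list_indices, axis=0)       # (412,)
# these could then be assigned to the Dense(412) layer of the submodel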
The problem is that I don't know how to get the correct weights of the next layer, the ones that correspond to the indices of the neurons I just removed and that I should assign to the next layer of the submodel. I hope I have explained myself clearly.
Thanks, Leyla.
CodePudding user response:
Your operation is known in the literature as selective dropout: there is no actual need to create a different model every time, you just need to multiply the output of your selected neurons by 0, so that the input of the next layer does not take those activations into account.
Note that if you "turn off" a neuron in the layer Ln, it does not completely "turn off" any neuron in the layer Ln+1, supposing that both are fully-connected (dense) layers: each neuron in the Ln+1 layer is connected to ALL the neurons in the previous layer. In other words, removing a neuron in a fully-connected (dense) layer does not affect the dimension of the next one.
You can simply implement this operation with the Multiply layer (Keras). The drawback is that you need to learn how to use the Keras functional API. There are other ways, but they are more complex than this (e.g. a custom layer); besides, the functional API is very useful and powerful in many respects, so it is highly recommended reading!
Your model would become like this:
data_format = 'channels_last'
input_shape = [28, 28, 1]
max_pool = ...
conv2d = ...

# convert a list of indexes to a mask tensor of weights
def make_index_weights(indexes, units):
    # 0.0 at the positions to drop, 1.0 everywhere else
    indexes = [ float(i not in indexes) for i in range(units) ]
    # converting the mask from a list to a tensor
    indexes = tf.convert_to_tensor(indexes)
    # reshaping to the correct format: one mask value per unit
    indexes = tf.reshape(indexes, (1, units))
    # ensuring it is a float tensor
    indexes = tf.cast(indexes, 'float32')
    return indexes

# layer builder utility
def selective_dropout(units, indexes, **kwargs):
    indexes = make_index_weights(indexes, units)
    dense = tf.keras.layers.Dense(units, **kwargs)
    mul = tf.keras.layers.Multiply()
    # return the tensor builder
    return lambda inputs: mul([ dense(inputs), indexes ])

input_layer = tf.keras.layers.Input(input_shape)
conv_1 = conv2d(filters=32)(input_layer)
maxp_1 = max_pool()(conv_1)
conv_2 = conv2d(filters=64)(maxp_1)
maxp_2 = max_pool()(conv_2)
flat = tf.keras.layers.Flatten()(maxp_2)
sel_drop_1 = selective_dropout(512, INDEXES, activation=tf.nn.relu)(flat)
dense_2 = tf.keras.layers.Dense(10 if only_digits else 62)(sel_drop_1)
output_layer = dense_2
model = tf.keras.models.Model([ input_layer ], [ output_layer ])
return model
Now you just need to build up your INDEXES list according to the indexes of the neurons you need to remove.
In your case, the mask tensor will have a shape of 1x512, because there are 512 weights (units/neurons) in the dense layer, so you need to provide as many mask values as there are units. The selective_dropout function lets you pass a list of indexes to discard, and it automatically builds up the desired tensor.
For example, if you want to remove the neurons 1, 10 and 12, you just pass the list [1, 10, 12] to the function and it will produce a 1x512 tensor with 0.0 at those positions and 1.0 at all the others.
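For instance, a quick check of the mask with a small number of units (8 here, just for readability; make_index_weights is the helper defined above):

mask = make_index_weights([1, 3], units=8)
print(mask)
# tf.Tensor([[1. 0. 1. 0. 1. 1. 1. 1.]], shape=(1, 8), dtype=float32)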
EDIT:
As you mentioned, you strictly need to reduce the number of parameters in your model.
Each dense layer is described by the relation y = Wx + B, where W is the kernel (or weight matrix) and B is the bias vector. W is a matrix of INPUT x OUTPUT dimensions, where INPUT is the last layer's output shape and OUTPUT is the number of neurons/units/weights in the layer; B is just a vector of dimension 1 x OUTPUT (but we are not interested in it here).
The problem now is that you are dropping N neurons in the layer Ln, and this induces the drop of N x OUTPUT weights in the layer Ln+1. Let's be practical with some numbers. In your case (supposing only_digits is true) you start with:
Nx512 -> 512x10 (5120 weights)
And after dropping 100 neurons (which means a drop of 100*10 = 1000 weights in the next layer):
Nx412 -> 412x10 (4120 weights)
Now, each column of the W matrix describes a neuron of the next layer, as a vector of weights with a dimension equal to the previous layer's output dimension (in our case 512 or 412). The rows of the matrix instead represent the single neurons of the previous layer.
W[0,0] indicates the relation between the first neuron of layer n and the first neuron of layer n+1:
W[0,0] -> 1st of n, 1st of n+1
W[0,1] -> 1st of n, 2nd of n+1
W[1,0] -> 2nd of n, 1st of n+1
And so on. So you can just remove from this matrix all the rows related to the neuron indexes you removed: index 0 -> row 0.
You can access the W matrix as a tensor from the dense layer with dense.kernel (and the bias vector with dense.bias).
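To make the row removal concrete, here is a minimal sketch of the whole pruning, assuming model is the original Sequential model from the question and drop_indices holds the neurons to remove (prune_dense_pair and the NumPy calls are my own illustration, not a Keras API):

import numpy as np

def prune_dense_pair(dense_a, dense_b, drop_indices):
    # remove units from dense_a and the matching input rows of dense_b
    kernel_a, bias_a = dense_a.get_weights()   # e.g. (3136, 512), (512,)
    kernel_b, bias_b = dense_b.get_weights()   # e.g. (512, 10), (10,)
    # drop the COLUMNS of W_a and the bias entries of the removed units
    kernel_a = np.delete(kernel_a, drop_indices, axis=1)   # (3136, 412)
    bias_a = np.delete(bias_a, drop_indices, axis=0)       # (412,)
    # drop the matching ROWS of W_b: index i -> row i
    kernel_b = np.delete(kernel_b, drop_indices, axis=0)   # (412, 10)
    return (kernel_a, bias_a), (kernel_b, bias_b)

# usage: build the smaller architecture, then assign the pruned weights
drop_indices = list(range(100))
(w1, b1), (w2, b2) = prune_dense_pair(model.layers[5], model.layers[6],
                                      drop_indices)
small_model = ...  # same model, but with Dense(412) instead of Dense(512)
small_model.layers[5].set_weights([w1, b1])
small_model.layers[6].set_weights([w2, b2])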