The keys of the BERT encoder's output are default, encoder_outputs, pooled_output and sequence_output. As far as I know, encoder_outputs are the outputs of each encoder block, pooled_output is the output for the global context, and sequence_output is the contextual output for each token (please correct me if I'm wrong). But what is default? Can you give me a more detailed explanation of each one?
This is the link to the encoder
CodePudding user response:
The TensorFlow docs provide a very good explanation of the outputs you are asking about:
The BERT models return a map with 3 important keys: pooled_output, sequence_output, encoder_outputs:
pooled_output represents each input sequence as a whole. The shape is [batch_size, H]. You can think of this as an embedding for the entire movie review.
sequence_output represents each input token in the context. The shape is [batch_size, seq_length, H]. You can think of this as a contextual embedding for every token in the movie review.
encoder_outputs are the intermediate activations of the L Transformer blocks. outputs["encoder_outputs"][i] is a Tensor of shape [batch_size, seq_length, 1024] with the outputs of the i-th Transformer block, for 0 <= i < L. The last value of the list is equal to sequence_output
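To make the shapes above concrete, here is a minimal NumPy sketch that uses random stand-in arrays rather than a real BERT model. The toy sizes, the variable names and the random pooler weights are all assumptions for illustration. The pooler step (a dense layer with tanh applied to the first token's vector) follows the standard BERT design, which I am assuming also applies to the TF Hub encoders.

```python
import numpy as np

# Toy sizes chosen for illustration only; not a real BERT configuration.
batch_size, seq_length, H, L = 2, 6, 8, 4

rng = np.random.default_rng(0)

# Stand-ins for the activations of the L Transformer blocks.
encoder_outputs = [
    rng.normal(size=(batch_size, seq_length, H)) for _ in range(L)
]

# The last block's activations are the per-token contextual embeddings.
sequence_output = encoder_outputs[-1]

# Assumed standard BERT pooler: dense layer + tanh on the first ([CLS]) token.
pooler_kernel = rng.normal(size=(H, H))  # random stand-in weights
pooler_bias = np.zeros(H)
cls_vectors = sequence_output[:, 0, :]   # shape [batch_size, H]
pooled_output = np.tanh(cls_vectors @ pooler_kernel + pooler_bias)

outputs = {
    "sequence_output": sequence_output,
    "encoder_outputs": encoder_outputs,
    "pooled_output": pooled_output,
    "default": pooled_output,  # mirrors the claim that default equals pooled_output
}

print("pooled_output:  ", outputs["pooled_output"].shape)
print("sequence_output:", outputs["sequence_output"].shape)
print("encoder_outputs:", len(outputs["encoder_outputs"]), "tensors of shape",
      outputs["encoder_outputs"][0].shape)
```

The point of the sketch is only the bookkeeping: one vector per sequence for pooled_output, one vector per token for sequence_output, and a list of L per-token tensors for encoder_outputs.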
Here is another interesting discussion on the difference between pooled_output and sequence_output, if you are interested.
The default output is equal to the pooled_output, which you can verify here:
import tensorflow as tf
import tensorflow_hub as hub
import tensorflow_text as text  # registers the custom ops the preprocessing model needs
tfhub_handle_preprocess = 'https://tfhub.dev/tensorflow/bert_en_uncased_preprocess/3'
tfhub_handle_encoder = 'https://tfhub.dev/tensorflow/small_bert/bert_en_uncased_L-4_H-512_A-8/1'
def build_classifier_model(name):
    # Build a model that returns only the encoder output stored under `name`.
    text_input = tf.keras.layers.Input(shape=(), dtype=tf.string, name='features')
    bert_preprocess_model = hub.KerasLayer(tfhub_handle_preprocess, name='preprocessing')
    encoder_inputs = bert_preprocess_model(text_input)
    encoder = hub.KerasLayer(tfhub_handle_encoder)
    outputs = encoder(encoder_inputs)
    net = outputs[name]
    return tf.keras.Model(text_input, net)

sentence = tf.constant([
    "Improve the physical fitness of your goldfish by getting him a bicycle"
])
classifier_model = build_classifier_model(name='default')
default_output = classifier_model(sentence)
classifier_model = build_classifier_model(name='pooled_output')
pooled_output = classifier_model(sentence)
print(default_output == pooled_output)
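The final print shows an elementwise boolean tensor, so you have to scan it by eye. If you prefer a single True/False, a small helper along these lines could collapse the comparison. This is only a sketch: outputs_match is a hypothetical name, not part of TensorFlow, and it relies on eager tensors being convertible with np.asarray. The demo below uses plain NumPy arrays as stand-ins for the two model outputs.

```python
import numpy as np

def outputs_match(a, b, atol=0.0):
    """Return True if two array-like outputs have the same shape and values.

    Hypothetical helper for illustration; `atol` allows a small absolute
    tolerance if the two outputs come from numerically different runs.
    """
    a = np.asarray(a)
    b = np.asarray(b)
    if a.shape != b.shape:
        return False
    return bool(np.allclose(a, b, rtol=0.0, atol=atol))

# Stand-in arrays in place of default_output and pooled_output.
fake_default = np.array([[0.1, -0.2, 0.3]])
fake_pooled = fake_default.copy()
fake_other = np.array([[0.1, -0.2, 0.4]])

print(outputs_match(fake_default, fake_pooled))  # identical values
print(outputs_match(fake_default, fake_other))   # one value differs
```

With the real tensors from the snippet above, the call would be outputs_match(default_output, pooled_output).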