I have a huggingface model:
from transformers import BertForSequenceClassification

model_name = 'bert-base-uncased'
model = BertForSequenceClassification.from_pretrained(model_name, num_labels=1).to(device)
How can I change the default classifier head, since it's only a single linear layer? I found this issue in the Hugging Face GitHub repo, which says:
You can also replace self.classifier with your own model.
model = BertForSequenceClassification.from_pretrained("bert-base-multilingual-cased")
model.classifier = new_classifier
where new_classifier is any pytorch model that you want.
However, I can't figure out what the structure of the new_classifier should look like (in particular its inputs and outputs, so that it can handle batches).
CodePudding user response:
By looking at the source code of BertForSequenceClassification here, you can see that the classifier is simply a linear layer that projects the BERT output from hidden_size dimensions down to num_labels dimensions. Suppose you want to replace the linear classifier with a two-layer MLP with a ReLU activation; you can do the following:
config = model.config  # the model's BertConfig

new_classifier = nn.Sequential(
    nn.Linear(config.hidden_size, config.hidden_size * 2),
    nn.ReLU(),
    nn.Linear(config.hidden_size * 2, config.num_labels)
)
model.classifier = new_classifier
The only requirements on your new classifier are that its input dimension is config.hidden_size and its output dimension is config.num_labels. The structure of the classifier doesn't depend on the batch size: modules like nn.Linear accept input of shape (*, hidden_size), where * is any number of leading (e.g. batch) dimensions, so you don't need to specify the batch size when creating the new classifier.
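To see this in isolation, here is a minimal sketch in plain PyTorch (no model download needed); hidden_size and num_labels are hard-coded stand-ins for the values in model.config:

```python
import torch
import torch.nn as nn

# Stand-ins for config.hidden_size and config.num_labels
hidden_size = 768
num_labels = 2

# The two-layer MLP head described above
new_classifier = nn.Sequential(
    nn.Linear(hidden_size, hidden_size * 2),
    nn.ReLU(),
    nn.Linear(hidden_size * 2, num_labels),
)

# Simulate a batch of pooled BERT outputs: shape (batch_size, hidden_size)
batch = torch.randn(4, hidden_size)
logits = new_classifier(batch)
print(logits.shape)  # torch.Size([4, 2])
```

The batch dimension (4 here) passes through untouched; only the last dimension is transformed from hidden_size to num_labels, which is exactly why the classifier itself never needs to know the batch size.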