How do I put a different classifier on top of BertForSequenceClassification?

Time: 10-07

I have a huggingface model:

from transformers import BertForSequenceClassification

model_name = 'bert-base-uncased'
model = BertForSequenceClassification.from_pretrained(model_name, num_labels=1).to(device)
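For reference, the default head can be inspected directly on the model. A minimal sketch (it builds the model from a random-weight config instead of `from_pretrained`, purely to avoid downloading the checkpoint):

```python
from transformers import BertConfig, BertForSequenceClassification

# Random-weight config with the same defaults as bert-base-uncased
config = BertConfig(num_labels=1)
model = BertForSequenceClassification(config)

# The default head is a single linear layer: hidden_size -> num_labels
print(model.classifier)  # Linear(in_features=768, out_features=1, bias=True)
```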

How can I change the default classifier head, since it's only a single linear layer? I found this issue in the Hugging Face GitHub repo, which said:

You can also replace self.classifier with your own model.

model = BertForSequenceClassification.from_pretrained("bert-base-multilingual-cased")
model.classifier = new_classifier

where new_classifier is any pytorch model that you want.

However, I can't figure out what the structure of new_classifier should look like (in particular its inputs and outputs, so that it can handle batches).

CodePudding user response:

By looking at the source code of BertForSequenceClassification here, you can see that the classifier is simply a linear layer that projects the BERT pooled output from hidden_size dimensions down to num_labels dimensions. Suppose you want to change the linear classifier to a two-layer MLP with a ReLU activation; you can do the following:

config = model.config  # hidden_size and num_labels come from the model's config
new_classifier = nn.Sequential(
    nn.Linear(config.hidden_size, config.hidden_size * 2),
    nn.ReLU(),
    nn.Linear(config.hidden_size * 2, config.num_labels)
)
model.classifier = new_classifier
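Putting it together, here is a minimal end-to-end sketch. It uses a small random-weight config so it runs without downloading any checkpoint; in practice you would load 'bert-base-uncased' with from_pretrained as in the question:

```python
import torch
from torch import nn
from transformers import BertConfig, BertForSequenceClassification

# Tiny random-weight model just for illustration (no download needed)
config = BertConfig(hidden_size=64, num_hidden_layers=2,
                    num_attention_heads=2, intermediate_size=128,
                    num_labels=1)
model = BertForSequenceClassification(config)

# Swap the default single linear classifier for a two-layer MLP
model.classifier = nn.Sequential(
    nn.Linear(config.hidden_size, config.hidden_size * 2),
    nn.ReLU(),
    nn.Linear(config.hidden_size * 2, config.num_labels),
)

# Dummy batch of 4 sequences of length 8 to check that batching works
input_ids = torch.randint(0, config.vocab_size, (4, 8))
attention_mask = torch.ones_like(input_ids)
logits = model(input_ids=input_ids, attention_mask=attention_mask).logits
print(logits.shape)  # torch.Size([4, 1]) — one output per sequence in the batch
```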

The only requirements on your new classifier are that its input dimension be config.hidden_size and its output dimension be config.num_labels. The structure of the classifier doesn't depend on the batch size: modules like nn.Linear accept input of shape (*, H_in), so you don't need to specify the batch size when creating the new classifier.
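To illustrate that last point: nn.Linear applies only to the last dimension, so the same head works with any number of leading (batch) dimensions. A quick check with example shapes:

```python
import torch
from torch import nn

# A head shaped like the default classifier: hidden_size=768 -> num_labels=1
head = nn.Linear(768, 1)

single = head(torch.randn(768))     # input (768,)    -> output (1,)
batch = head(torch.randn(32, 768))  # input (32, 768) -> output (32, 1)
print(single.shape, batch.shape)
```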
