I have a list of sentences:
sentences = ["Missing Plate", "Plate not found"]
I am trying to find the most similar sentences in the list using a Transformers model with Hugging Face embeddings. I am able to find similar sentences, but the model still cannot tell the difference between:
"Message ID exists"
"Message ID doesn't exist"
[Note: I am computing the similarity with cosine similarity from PyTorch]
Can you suggest ways to tune the model's hyperparameters so that it weighs negation words more heavily and treats negated sentences as opposites?
I found the list of parameters that can be tuned, but I am not sure what the best values would be.
Thanks!
CodePudding user response:
One major shortcoming of BERT and other transformer variants is their inability to handle negation. Here is an excerpt from the paper What BERT Is Not: Lessons from a New Suite of Psycholinguistic Diagnostics for Language Models:
Most strikingly, however, we find that BERT fails completely to show generalizable understanding of negation, raising questions about the aptitude of LMs to learn this type of meaning.
This phenomenon can easily be observed by interacting with the BERT inference API on the Hugging Face model hub. You can experiment with the following sentences:
A hammer is an [MASK].
A hammer is not an [MASK].
In both cases, the highest-scoring token is "object", despite the explicit negation in the second sentence.
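You can run the same experiment locally with the transformers fill-mask pipeline. This is a minimal sketch; the checkpoint name bert-base-uncased is my assumption here, and any BERT-style fill-mask checkpoint would do:

```python
# Sketch of the masked-word experiment described above.
# Assumption: using the bert-base-uncased checkpoint; this downloads
# the model on first run.
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")

for text in ["A hammer is an [MASK].", "A hammer is not an [MASK]."]:
    top = fill(text)[0]  # highest-scoring prediction for the mask
    print(f"{text!r} -> {top['token_str']} (score {top['score']:.3f})")
```

Comparing the two printed predictions shows how little the "not" changes the model's top guess.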
Therefore, I don't believe that further fine-tuning or hyper-parameter tuning will help in this case.
CodePudding user response:
I recommend using Sentence Transformers, which builds on BERT-style models.
First, install the package:
pip install sentence-transformers
Then:
from sentence_transformers import SentenceTransformer

sentences = ["This is an example sentence", "Each sentence is converted"]
model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2")
embeddings = model.encode(sentences)  # one embedding vector per sentence
Now you can compare the embeddings with cosine similarity. For more information about this model, see its model card on the Hugging Face Hub.
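The cosine-similarity step itself needs nothing model-specific. A minimal sketch using PyTorch (which the question already mentions), with toy vectors standing in for the sentence embeddings produced by model.encode:

```python
import torch
import torch.nn.functional as F

# Toy row vectors standing in for sentence embeddings
# (real embeddings come from model.encode above).
a = torch.tensor([[1.0, 0.0, 1.0]])
b = torch.tensor([[1.0, 0.0, 1.0]])
c = torch.tensor([[0.0, 1.0, 0.0]])

print(F.cosine_similarity(a, b).item())  # 1.0 (same direction)
print(F.cosine_similarity(a, c).item())  # 0.0 (orthogonal)
```

A score near 1.0 means the sentences point in almost the same direction in embedding space, which is exactly why negated pairs like the ones in the question can still score high.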