I have been battling with my own implementation on my dataset with a different transformer model than the tutorial, and I have been getting this error AttributeError: 'NoneType' object has no attribute 'dtype'
, when i was starting to train my model. I have been trying to debug for hours, and then I have tried the tutorial from hugging face as it can be found here https://huggingface.co/transformers/v3.2.0/custom_datasets.html. Running this exact code, so I could identify my mistake, also leads to the same error.
!wget http://ai.stanford.edu/~amaas/data/sentiment/aclImdb_v1.tar.gz
!tar -xf aclImdb_v1.tar.gz
from pathlib import Path
def read_imdb_split(split_dir):
split_dir = Path(split_dir)
texts = []
labels = []
for label_dir in ["pos", "neg"]:
for text_file in (split_dir/label_dir).iterdir():
texts.append(text_file.read_text())
labels.append(0 if label_dir is "neg" else 1)
return texts, labels
train_texts, train_labels = read_imdb_split('aclImdb/train')
test_texts, test_labels = read_imdb_split('aclImdb/test')
from sklearn.model_selection import train_test_split
train_texts, val_texts, train_labels, val_labels = train_test_split(train_texts, train_labels, test_size=.2)
from transformers import DistilBertTokenizerFast
tokenizer = DistilBertTokenizerFast.from_pretrained('distilbert-base-uncased')
train_encodings = tokenizer(train_texts, truncation=True, padding=True)
val_encodings = tokenizer(val_texts, truncation=True, padding=True)
test_encodings = tokenizer(test_texts, truncation=True, padding=True)
import tensorflow as tf
train_dataset = tf.data.Dataset.from_tensor_slices((
dict(train_encodings),
train_labels
))
val_dataset = tf.data.Dataset.from_tensor_slices((
dict(val_encodings),
val_labels
))
test_dataset = tf.data.Dataset.from_tensor_slices((
dict(test_encodings),
test_labels
))
from transformers import TFDistilBertForSequenceClassification
model = TFDistilBertForSequenceClassification.from_pretrained('distilbert-base-uncased')
optimizer = tf.keras.optimizers.Adam(learning_rate=5e-5)
model.compile(optimizer=optimizer, loss=model.compute_loss) # can also use any keras loss fn
model.fit(train_dataset.shuffle(1000).batch(16), epochs=3, batch_size=16)
My goal will be to perform multi-label text classification on my own custom dataset, which unfortunately I cannot share for privacy reasons. If anyone could point out what is wrong with this implementation, will be highly appreciated.
CodePudding user response:
There seems to be an error, when you are passing the loss parameter.
model.compile(optimizer=optimizer, loss=model.compute_loss) # can also use any keras loss fn
You don't need to pass the loss parameter, if you want to use the model's built-in loss function.
I was able to train the model with your provided source code by changing mentioned line to:
model.compile(optimizer=optimizer)
or by passing a loss function
loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
model.compile(optimizer=optimizer, loss=loss_fn)
transformers version: 4.20.1
Hope it helps.