ValueError: No gradients provided for any variable: ['tf_deberta_v2_for_sequence_classification


I am trying to fine-tune a transformer model for text classification, but I am having trouble training it. I have tried many things, including solutions from other questions, but none of them worked. I am fine-tuning the 'microsoft/deberta-v3-base' model. Here's my code:

from datasets import Dataset
from transformers import TFAutoModelForSequenceClassification, create_optimizer
import numpy as np
import tensorflow as tf

train_dataset = Dataset.from_pandas(df_tr[['text', 'label']]).class_encode_column("label")
val_dataset = Dataset.from_pandas(df_tes[['text', 'label']]).class_encode_column("label")

train_tok_dataset = train_dataset.map(tokenizer_func, batched=True, remove_columns=['text'])
val_tok_dataset = val_dataset.map(tokenizer_func, batched=True, remove_columns=['text'])

# Load the base model with hidden states exposed so a custom head can be built on top
transformer_model = TFAutoModelForSequenceClassification.from_pretrained(config.model_name, output_hidden_states=True)

input_ids = tf.keras.Input(shape=(config.max_len, ),dtype='int32')
attention_mask = tf.keras.Input(shape=(config.max_len, ), dtype='int32')

transformer = transformer_model([input_ids, attention_mask])
hidden_states = transformer[1]  # tuple of hidden states, one per layer

hidden_states_size = 4  # number of last hidden states to use
hidden_states_ind = list(range(-hidden_states_size, 0, 1))

selected_hidden_states = tf.keras.layers.concatenate(
    [hidden_states[i] for i in hidden_states_ind]
)

# Classification head on top of the selected hidden states
output = tf.keras.layers.Dense(128, activation='relu')(selected_hidden_states)
output = tf.keras.layers.Flatten()(output)
output = tf.keras.layers.Dense(3, activation='softmax')(output)
model = tf.keras.models.Model(inputs=[input_ids, attention_mask], outputs=output)

batch_size = 8
num_epochs = config.epochs
num_steps = len(train_tok_dataset) // batch_size  # batches per epoch
total_train_steps = int(num_steps * num_epochs)
optimizer, schedule = create_optimizer(init_lr=2e-5, num_warmup_steps=0, num_train_steps=total_train_steps)

model.compile(optimizer=optimizer)

with tf.device('GPU:0'):
    model.fit(
        x=[np.array(train_tok_dataset["input_ids"]), np.array(train_tok_dataset["attention_mask"])],
        y=tf.keras.utils.to_categorical(y_train, num_classes=3),
        validation_data=(
            [np.array(val_tok_dataset["input_ids"]), np.array(val_tok_dataset["attention_mask"])],
            tf.keras.utils.to_categorical(y_test, num_classes=3),
        ),
        epochs=config.epochs,
        class_weight={0: 0.57, 1: 0.18, 2: 0.39},
    )

It seems like a small issue, but I am new to TensorFlow and transformers, so I couldn't sort it out myself.

CodePudding user response:

I would say it's probably because you are not passing a loss when compiling the model, so no gradients can be computed with respect to it:

model.compile(optimizer=optimizer)
              ^^^^^^^^^^^^^^^^^^^^---- no "loss = tf.keras.losses...
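For example, since your labels are one-hot encoded with to_categorical and the final layer is a softmax, a categorical cross-entropy loss should match (a minimal sketch; the metrics argument is optional):

model.compile(
    optimizer=optimizer,
    loss=tf.keras.losses.CategoricalCrossentropy(),  # labels are one-hot, softmax outputs probabilities
    metrics=["accuracy"],
)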

CodePudding user response:

Maybe you're just missing an = sign after validation_data:

model.fit(
    x=[np.array(...),np.array(...)], 
    y=tf.keras.utils.to_categorical(...),
    validation_data=([np.array(...), np.array(...)], tf.keras.utils.to_categorical(...)),
    ...
)
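For completeness, here is a sketch that folds both suggestions into the code from the question (same model, optimizer, arrays, and config as above; nothing new is assumed beyond them):

model.compile(
    optimizer=optimizer,
    loss=tf.keras.losses.CategoricalCrossentropy(),  # a loss is required so gradients can be computed
    metrics=["accuracy"],
)

with tf.device('GPU:0'):
    model.fit(
        x=[np.array(train_tok_dataset["input_ids"]), np.array(train_tok_dataset["attention_mask"])],
        y=tf.keras.utils.to_categorical(y_train, num_classes=3),
        validation_data=(
            [np.array(val_tok_dataset["input_ids"]), np.array(val_tok_dataset["attention_mask"])],
            tf.keras.utils.to_categorical(y_test, num_classes=3),
        ),
        epochs=config.epochs,
        class_weight={0: 0.57, 1: 0.18, 2: 0.39},
    )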