I am using TensorFlow 2.8. I followed the TensorFlow tutorial on writing your own fit() by overriding the train_step function of a custom Keras model class. I wanted to add class_weight, but in their section "Supporting sample_weight & class_weight" they only show how to use sample_weight, not class_weight.
Is there a way to use class_weight in a custom train_step function?
I also found this Colab notebook in a GitHub issue. However, it creates a custom model class but never actually uses it, so it is of no help either.
When I actually create the custom model and call fit(), I get TypeError: __call__() got an unexpected keyword argument 'class_weight' at the point where the loss is calculated in train_step().
Example code (with the error) of what I'm trying to do:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
import numpy as np
# Prepare the training dataset.
batch_size = 64
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
x_train = np.reshape(x_train, (-1, 784))
x_test = np.reshape(x_test, (-1, 784))
# Reserve 10,000 samples for validation.
x_val = x_train[-10000:]
y_val = y_train[-10000:]
x_train = x_train[:-10000]
y_train = y_train[:-10000]
# Prepare the training dataset.
train_dataset = tf.data.Dataset.from_tensor_slices((x_train, y_train))
train_dataset = train_dataset.shuffle(buffer_size=1024).batch(batch_size)
# Get model
inputs = keras.Input(shape=(784,), name="digits")
x = layers.Dense(64, activation="relu", name="dense_1")(inputs)
x = layers.Dense(64, activation="relu", name="dense_2")(x)
outputs = layers.Dense(10, name="predictions")(x)
# Instantiate an optimizer to train the model.
optimizer = keras.optimizers.SGD(learning_rate=1e-3)
# Instantiate a loss function.
loss_fn = keras.losses.SparseCategoricalCrossentropy(from_logits=True)
# Prepare the metrics.
train_acc_metric = keras.metrics.SparseCategoricalAccuracy()
class CustomModel(keras.Model):
    def train_step(self, data):
        # Unpack the data. Its structure depends on your model and
        # on what you pass to `fit()`.
        if len(data) == 3:
            x, y, class_weight = data
        else:
            x, y = data
            class_weight = None
        with tf.GradientTape() as tape:
            logits = self(x, training=True)  # Forward pass
            # Compute the loss value.
            # The loss function is configured in `compile()`.
            loss = self.compiled_loss(
                y,
                logits,
                class_weight=class_weight,
                regularization_losses=self.losses,
            )
        # Compute gradients
        trainable_vars = self.trainable_variables
        gradients = tape.gradient(loss, trainable_vars)
        # Update weights
        self.optimizer.apply_gradients(zip(gradients, trainable_vars))
        # Update the metrics.
        # Metrics are configured in `compile()`.
        self.compiled_metrics.update_state(y, logits, class_weight=class_weight)
        # Return a dict mapping metric names to current value.
        return {m.name: m.result() for m in self.metrics}
# Construct and compile an instance of CustomModel
model = CustomModel(inputs=inputs, outputs=outputs)
model.compile(optimizer=optimizer, loss=loss_fn, metrics=["accuracy"])
class_weight = {
    0: 1.0,
    1: 1.0,
    2: 1.0,
    3: 1.0,
    4: 1.0,
    # Set weight "2" for class "5",
    # making this class 2x more important
    5: 2.0,
    6: 1.0,
    7: 1.0,
    8: 1.0,
    9: 1.0,
}
model.fit(train_dataset, class_weight=class_weight, epochs=3)
All of these examples fail because loss functions don't take class_weight as an argument, as far as I understand from the documentation.
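For example, my own minimal check against the built-in loss shows that only sample_weight is accepted by the call (the tensors here are just made-up illustration values):
import tensorflow as tf

loss_fn = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
y_true = tf.constant([0, 1])
y_pred = tf.constant([[2.0, 0.1, 0.1], [0.1, 2.0, 0.1]])

# Per-sample weighting works, since sample_weight is part of the standard loss call.
print(loss_fn(y_true, y_pred, sample_weight=tf.constant([1.0, 2.0])))

# This raises TypeError: __call__() got an unexpected keyword argument 'class_weight'
# loss_fn(y_true, y_pred, class_weight={0: 1.0, 1: 2.0})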
I tried to fix this by creating my own loss function:
@tf.function
def weighted_sparse_categorical_crossentropy(labels, logits, class_weight=None):
    loss = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)(labels, logits)
    if class_weight is None:
        # No class weights set, return the unweighted loss.
        return loss
    # Get all class weights as a list.
    if type(class_weight) is not list:
        class_weight = list(class_weight.values())
    class_weights_gathered = tf.gather(class_weight, labels)
    return tf.reduce_mean(class_weights_gathered * loss)
Then I use this to compile my model and call .fit():
model.compile(optimizer=optimizer, loss=weighted_sparse_categorical_crossentropy)
model.fit(X, class_weight=class_weight, epochs=1)
But I still get TypeError: __call__() got an unexpected keyword argument 'class_weight', even though class_weight is clearly an argument of my function.
I also looked at the TensorFlow source on GitHub to see what fit() actually does with class_weight, and it seems to convert it into a sample_weight somehow.
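For reference, my rough understanding of that conversion is sketched below; the helper name make_sample_weight is my own and this is only an illustration, not the actual Keras source:
import numpy as np
import tensorflow as tf

def make_sample_weight(y, class_weight):
    # Lookup table: index = class id, value = weight for that class.
    weight_table = tf.constant(
        [class_weight[i] for i in range(len(class_weight))], dtype=tf.float32
    )
    # Map every label in the batch to the weight of its class.
    return tf.gather(weight_table, tf.cast(y, tf.int32))

# Example: class 5 counts twice as much as the others.
y_batch = np.array([0, 5, 3, 5])
print(make_sample_weight(y_batch, {i: (2.0 if i == 5 else 1.0) for i in range(10)}))
# -> [1. 2. 1. 2.]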
So I'm not sure whether what I want is even possible. But then the section in the official TensorFlow tutorial would be misleading, since there would be no real support for class_weight.
CodePudding user response:
Pretty sure the error is here:
loss = self.compiled_loss(
    y,
    logits,
    class_weight=class_weight,
    regularization_losses=self.losses,
)
because class_weight is not a recognized argument there. Instead, you should use sample_weight, and from the Keras documentation it is as simple as unpacking the incoming data, as you are already doing:
def train_step(self, data):
    x, y, sample_weight = data
Your doubt about the "missing example" comes from the fact that Keras automatically transforms your class_weight into a sample_weight, as you can see from this code:
import numpy as np
import tensorflow as tf
from tensorflow import keras as K

class M(K.Model):
    def train_step(self, data):
        tf.print(data)
        return {"loss": 0}

model = M()
model.compile(K.optimizers.Adam(), K.losses.SparseCategoricalCrossentropy(), run_eagerly=True)
model.fit(np.array([[-1], [-1], [-1]]), np.array([[0], [0], [1]]), class_weight={0: 7, 1: 9})
which prints:
([[-1][-1][-1]],
[[0] [1] [0]],
[7 9 7])
where you can clearly see that the first line is the x, the second line is the y (fit() shuffles array inputs by default, hence the reordering), and the third one is the class weight associated with each y, which you can feed to the loss as usual:
x, y, sample_weight = data
...
loss = self.compiled_loss(
    y,
    logits,
    sample_weight=sample_weight,
    regularization_losses=self.losses,
)
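Putting it together, a minimal sketch of a corrected train_step (reusing the inputs, outputs, optimizer, loss_fn and class_weight already defined in your question) could look like this:
class CustomModel(keras.Model):
    def train_step(self, data):
        # fit() has already turned class_weight into a per-sample weight,
        # so a batch arrives either as (x, y) or as (x, y, sample_weight).
        if len(data) == 3:
            x, y, sample_weight = data
        else:
            x, y = data
            sample_weight = None
        with tf.GradientTape() as tape:
            logits = self(x, training=True)  # Forward pass
            # The loss configured in compile() accepts sample_weight.
            loss = self.compiled_loss(
                y,
                logits,
                sample_weight=sample_weight,
                regularization_losses=self.losses,
            )
        # Compute and apply gradients.
        trainable_vars = self.trainable_variables
        gradients = tape.gradient(loss, trainable_vars)
        self.optimizer.apply_gradients(zip(gradients, trainable_vars))
        # Update the metrics configured in compile().
        self.compiled_metrics.update_state(y, logits, sample_weight=sample_weight)
        return {m.name: m.result() for m in self.metrics}

model = CustomModel(inputs=inputs, outputs=outputs)
model.compile(optimizer=optimizer, loss=loss_fn, metrics=["accuracy"])
model.fit(x_train, y_train, class_weight=class_weight, batch_size=batch_size, epochs=3)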