I have a 2D NumPy array of training data (49000 rows with 784 feature columns) and a corresponding label array (y_train) of categorical values labelled 0 to 9.
Numpy array details -
print(X_train.shape, "X_train.shape")
print(y_train.shape, "y_train.shape")
print(X_val.shape, "X_val.shape")
print(y_val.shape, "y_val.shape")
print(np.unique(y_train))
Output -
(49000, 784) X_train.shape
(49000,) y_train.shape
(1000, 784) X_val.shape
(1000,) y_val.shape
[0 1 2 3 4 5 6 7 8 9]
This is the code I am running -
y_train_one_hot = tf.keras.utils.to_categorical(y_train, num_classes=10)
y_val_one_hot = tf.keras.utils.to_categorical(y_val, num_classes=10)
y_test_one_hot = tf.keras.utils.to_categorical(y_test, num_classes=10)
dataset_train = tf.data.Dataset.from_tensor_slices((X_train, y_train_one_hot )).batch(32)
dataset_validate = tf.data.Dataset.from_tensor_slices((X_val, y_val_one_hot )).batch(32)
dataset_test = tf.data.Dataset.from_tensor_slices((X_test, y_test_one_hot )).batch(32)
model = tf.keras.Sequential([
    tf.keras.layers.Dense(784, activation='relu'),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax')
])
model.compile(
    optimizer=tf.keras.optimizers.SGD(),
    loss=tf.keras.losses.CategoricalCrossentropy(),
    metrics=[tf.keras.metrics.Accuracy()],
)
model.fit(dataset_train, epochs=10, validation_data=dataset_validate)
I get the following output
Epoch 1/10
1532/1532 [==============================] - 16s 10ms/step - loss: nan - accuracy: 0.0294 - val_loss: nan - val_accuracy: 0.0000e+00
Epoch 2/10
1532/1532 [==============================] - 12s 8ms/step - loss: nan - accuracy: 0.0000e+00 - val_loss: nan - val_accuracy: 0.0000e+00
Epoch 3/10
1532/1532 [==============================] - 14s 9ms/step - loss: nan - accuracy: 0.0000e+00 - val_loss: nan - val_accuracy: 0.0000e+00
Epoch 4/10
1532/1532 [==============================] - 11s 7ms/step - loss: nan - accuracy: 0.0000e+00 - val_loss: nan - val_accuracy: 0.0000e+00
Epoch 5/10
1532/1532 [==============================] - 13s 9ms/step - loss: nan - accuracy: 0.0000e+00 - val_loss: nan - val_accuracy: 0.0000e+00
Can anyone say what the problem is in my code? Please note that the y array has categorical labels so this is NOT a regression model.
CodePudding user response:
The error probably comes from the loss function you are using, tf.keras.losses.CategoricalCrossentropy. Try the SparseCategoricalCrossentropy loss function instead, which takes the integer labels directly rather than one-hot vectors. As stated in the documentation:
Use the CategoricalCrossentropy loss function when there are two or more label classes. We expect labels to be provided in a one_hot representation.
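To make the difference between the two losses concrete, here is a minimal NumPy sketch. It is not TensorFlow's implementation; the two helper functions and the toy probabilities are made up for illustration. It computes the cross-entropy once from one-hot labels and once from integer labels, and the two values agree as long as each loss gets the label format it expects.

```python
import numpy as np

def categorical_crossentropy(y_one_hot, probs):
    # Mean over the batch of -sum_k y_k * log(p_k); expects one-hot labels.
    return float(np.mean(-np.sum(y_one_hot * np.log(probs), axis=1)))

def sparse_categorical_crossentropy(y_int, probs):
    # Same quantity, but picks the true-class probability by integer index.
    rows = np.arange(len(y_int))
    return float(np.mean(-np.log(probs[rows, y_int])))

# Toy batch: 3 samples, 4 classes (illustrative values only).
probs = np.array([
    [0.7, 0.1, 0.1, 0.1],
    [0.2, 0.5, 0.2, 0.1],
    [0.1, 0.1, 0.1, 0.7],
])
y_int = np.array([0, 1, 3])       # integer labels, shape (3,)
y_one_hot = np.eye(4)[y_int]      # one-hot labels, shape (3, 4)

loss_dense = categorical_crossentropy(y_one_hot, probs)
loss_sparse = sparse_categorical_crossentropy(y_int, probs)
print(loss_dense, loss_sparse)
```

Mixing the formats (one-hot labels into the sparse loss, or integer labels into the dense one) is what should be avoided.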
CodePudding user response:
This happens when the labels are integers (e.g. 0, 1, 2, ..., 9) with shape (batch_size, 1) while the output of your model has shape (batch_size, 10), i.e. one probability per class.
Either convert your labels to one_hot, which turns each single label into a vector of size [num_classes]. The syntax is given below.
tf.keras.utils.to_categorical(
    y, num_classes=None, dtype='float32'
)
(in your case num_classes=10)
or use SparseCategoricalCrossentropy, which requires integer labels, as mentioned by @AloneTogether. The syntax is as follows.
tf.keras.losses.SparseCategoricalCrossentropy(
    from_logits=False, reduction=losses_utils.ReductionV2.AUTO,
    name='sparse_categorical_crossentropy'
)
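The one-hot option can be sketched in plain NumPy, assuming integer labels in the range 0 to num_classes - 1. The `one_hot` helper below is a hypothetical stand-in written for illustration, not the `to_categorical` implementation; it shows how labels of shape (batch_size,) become shape (batch_size, num_classes), which matches the model output.

```python
import numpy as np

def one_hot(y, num_classes):
    # Hypothetical helper: row i gets a 1.0 in column y[i], 0.0 elsewhere.
    y = np.asarray(y, dtype=int).reshape(-1)
    encoded = np.zeros((y.shape[0], num_classes), dtype="float32")
    encoded[np.arange(y.shape[0]), y] = 1.0
    return encoded

labels = np.array([3, 0, 9])               # integer labels, shape (3,)
encoded = one_hot(labels, num_classes=10)  # one-hot labels
print(encoded.shape)                       # (3, 10)
print(encoded.argmax(axis=1))              # recovers the original labels
```

Taking the argmax along the class axis goes back from one-hot vectors to integer labels, which is handy for checking that the encoding did what you expected.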