I am confused about the appropriate loss function to use, as I am generating my dataset using image_dataset_from_directory.
Data Generator
Train
train_ds = tf.keras.utils.image_dataset_from_directory(
    '/content/dataset/train',
    validation_split=0.05,
    subset="training",
    seed=123,
    image_size=(IMAGE_SIZE, IMAGE_SIZE),
    batch_size=BATCH_SIZE)
Validation
val_ds = tf.keras.preprocessing.image_dataset_from_directory(
    '/content/dataset/val',
    image_size=(IMAGE_SIZE, IMAGE_SIZE),
    batch_size=BATCH_SIZE)
Model
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense, Flatten

rn50v2_model = Sequential()
pretrained_model = tf.keras.applications.ResNet50V2(
    include_top=False,
    weights="imagenet",
    input_shape=(IMAGE_SIZE, IMAGE_SIZE, 3),
    pooling='avg',
    classes=2)  # note: `classes` is ignored when include_top=False
print(pretrained_model.summary())
rn50v2_model.add(pretrained_model)
rn50v2_model.add(Flatten())  # no-op here: pooling='avg' already yields a flat vector
rn50v2_model.add(Dense(512, activation='relu'))
rn50v2_model.add(Dense(2, activation='softmax'))
#print(rn50v2_model.summary())
rn50v2_model.compile(loss='sparse_categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
When I tested my model, I got a result that looks one-hot encoded, like below:
array([[0.24823777, 0.7517622 ]], dtype=float32)
I would prefer to use categorical_crossentropy, but please explain this behaviour; I can't seem to find anything about it in the official documentation.
CodePudding user response:
Your result is not one-hot encoded. It is the output of your final layer squashed by the softmax function, which maps each value into the range 0 to 1 so that they all sum to 1. You can apply np.argmax(predictions, axis=-1) to these "probabilities" to get the corresponding class. To use categorical_crossentropy, try changing your label_mode to 'categorical', which will automatically generate one-hot-encoded labels:
train_ds = tf.keras.utils.image_dataset_from_directory(
    '/content/dataset/train',
    validation_split=0.05,
    subset="training",
    seed=123,
    label_mode='categorical',
    image_size=(IMAGE_SIZE, IMAGE_SIZE),
    batch_size=BATCH_SIZE)
val_ds = tf.keras.utils.image_dataset_from_directory(
    '/content/dataset/val',
    label_mode='categorical',
    image_size=(IMAGE_SIZE, IMAGE_SIZE),
    batch_size=BATCH_SIZE)
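For illustration, here is a minimal sketch of recovering the class index from softmax output with np.argmax, using the prediction array from the question:

```python
import numpy as np

# Softmax output for a single sample (the array from the question);
# the two values lie in [0, 1] and sum to 1.
predictions = np.array([[0.24823777, 0.7517622]], dtype=np.float32)

# The index of the largest probability along the last axis is the predicted class
predicted_class = np.argmax(predictions, axis=-1)
print(predicted_class)  # [1]
```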
You could also consider using binary_crossentropy, since you only have two classes. You would then have to change both your loss function and your output layer to a single sigmoid unit:
rn50v2_model.add(Dense(1, activation='sigmoid'))
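A minimal sketch of that binary variant, assuming IMAGE_SIZE = 224 and that the datasets are created with label_mode='binary' so labels are 0/1 (weights=None is used here for brevity; pass weights="imagenet" for the pretrained backbone):

```python
import tensorflow as tf

IMAGE_SIZE = 224  # assumed input size

# Backbone without the classification head; pooling='avg' already
# produces a flat feature vector, so no Flatten layer is needed.
backbone = tf.keras.applications.ResNet50V2(
    include_top=False,
    weights=None,  # use "imagenet" for the pretrained weights
    input_shape=(IMAGE_SIZE, IMAGE_SIZE, 3),
    pooling='avg')

model = tf.keras.Sequential([
    backbone,
    tf.keras.layers.Dense(512, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid'),  # probability of class 1
])

# binary_crossentropy expects a single sigmoid output and 0/1 labels
# (e.g. label_mode='binary' in image_dataset_from_directory)
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
```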