ValueError: `logits` and `labels` must have the same shape, received ((None, 10) vs (None, 1))

I am running an Involution model (based on this example), and I keep running into an error during the training stage. This is the error:

ValueError: `logits` and `labels` must have the same shape, received ((None, 10) vs (None, 1)).

Below is the relevant code for dataset loading:

    train_datagen = ImageDataGenerator(
        rescale=1./255,
        shear_range=0.2,
        zoom_range=0.2,
        horizontal_flip=True)
    test_datagen = ImageDataGenerator(rescale=1./255)
    train_ds = train_datagen.flow_from_directory(
        'data/train',
        target_size=(150, 150),
        batch_size=128,
        class_mode='binary')
    test_ds = test_datagen.flow_from_directory(
        'data/test',
        target_size=(150, 150),
        batch_size=64,
        class_mode='binary')

And this is the code for training:

    print("building the involution model...")

    inputs = keras.Input(shape=(224, 224, 3))
    x, _ = Involution(channel=3, group_number=1, kernel_size=3, stride=1, reduction_ratio=2, name="inv_1")(inputs)
    x = keras.layers.ReLU()(x)
    x = keras.layers.MaxPooling2D((2, 2))(x)
    x, _ = Involution(
        channel=3, group_number=1, kernel_size=3, stride=1, reduction_ratio=2, name="inv_2")(x)
    x = keras.layers.ReLU()(x)
    x = keras.layers.MaxPooling2D((2, 2))(x)
    x, _ = Involution(
        channel=3, group_number=1, kernel_size=3, stride=1, reduction_ratio=2, name="inv_3")(x)
    x = keras.layers.ReLU()(x)
    x = keras.layers.Flatten()(x)
    x = keras.layers.Dense(64, activation="relu")(x)
    outputs = keras.layers.Dense(10)(x)

    inv_model = keras.Model(inputs=[inputs], outputs=[outputs], name="inv_model")

    print("compiling the involution model...")
    inv_model.compile(
        optimizer="adam",
        loss=keras.losses.BinaryCrossentropy(from_logits=True),
        metrics=["accuracy"],
    )

    print("inv model training...")
    inv_hist = inv_model.fit(train_ds, epochs=20, validation_data=test_ds)

The model itself is the same one used in the Keras example, and I have not changed anything except to use my own dataset instead of the CIFAR dataset (the model works for me with that dataset). So I am sure there is an error in my data loading, but I am unable to identify what it is.

Model Summary:

Model: "inv_model"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 input_14 (InputLayer)       [(None, 224, 224, 3)]     0         
                                                                 
 inv_1 (Involution)          ((None, 224, 224, 3),     26        
                              (None, 224, 224, 9, 1,             
                             1))                                 
                                                                 
 re_lu_39 (ReLU)             (None, 224, 224, 3)       0         
                                                                 
 max_pooling2d_26 (MaxPoolin  (None, 112, 112, 3)      0         
 g2D)                                                            
                                                                 
 inv_2 (Involution)          ((None, 112, 112, 3),     26        
                              (None, 112, 112, 9, 1,             
                             1))                                 
                                                                 
 re_lu_40 (ReLU)             (None, 112, 112, 3)       0         
                                                                 
 max_pooling2d_27 (MaxPoolin  (None, 56, 56, 3)        0         
 g2D)                                                            
                                                                 
 inv_3 (Involution)          ((None, 56, 56, 3),       26        
                              (None, 56, 56, 9, 1, 1)            
                             )                                   
                                                                 
 re_lu_41 (ReLU)             (None, 56, 56, 3)         0         
                                                                 
 flatten_15 (Flatten)        (None, 9408)              0         
                                                                 
 dense_26 (Dense)            (None, 64)                602176    
                                                                 
 dense_27 (Dense)            (None, 10)                650       
                                                                 
=================================================================

Answer:

When you called train_datagen.flow_from_directory(), you used class_mode='binary', which yields a single 0/1 label per image, so the labels have shape (None, 1). Your final layer, Dense(10), has 10 neurons and therefore produces logits of shape (None, 10). Hence the labels and logits don't match. Solution: use class_mode='categorical', which one-hot encodes the labels so that each label has as many entries as there are classes. Do the same for test_datagen. Note that this only matches Dense(10) if your data directories contain 10 class subfolders; otherwise, change the number of units in the final Dense layer to your number of classes.
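
To make the shape mismatch concrete, here is a minimal plain-Python sketch of the two label encodings. It does not call Keras. The helpers binary_labels, one_hot_labels and shape are hypothetical names introduced only for illustration, and the batch of class indices is made up.

```python
# Plain-Python illustration of label shapes; no Keras required.
# All helper names below are hypothetical and exist only for this sketch.

def binary_labels(class_indices):
    """Mimic class_mode='binary' labels as the loss sees them: shape (batch, 1)."""
    return [[float(i)] for i in class_indices]

def one_hot_labels(class_indices, num_classes):
    """Mimic class_mode='categorical' labels: shape (batch, num_classes)."""
    return [
        [1.0 if k == i else 0.0 for k in range(num_classes)]
        for i in class_indices
    ]

def shape(rows):
    """Return (batch, width) for a rectangular list of lists."""
    return (len(rows), len(rows[0]))

batch = [0, 3, 9]  # made-up class indices for three images

print(shape(binary_labels(batch)))       # (3, 1): cannot line up with Dense(10) logits
print(shape(one_hot_labels(batch, 10)))  # (3, 10): lines up with Dense(10) logits
```

On the Keras side, the corresponding change is passing class_mode='categorical' to both flow_from_directory() calls. With one-hot labels, keras.losses.CategoricalCrossentropy(from_logits=True) is probably a better fit than BinaryCrossentropy for a single-label problem, though that is a suggestion rather than something the shape error itself requires.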