ValueError: `logits` and `labels` must have the same shape, received ((None, 16) vs (None, 1))


I found similar questions, but the few that had accepted answers did not work for me. The following is my code for a binary classifier:

import numpy as np
import pandas as pd
import tensorflow as tf
from sklearn.preprocessing import MinMaxScaler

from google.colab import drive
drive.mount('/content/drive')

df = pd.read_csv('/content/drive/My Drive/dielectron.csv')
df = df.drop(['Run', 'M'], axis=1)
df.info()
df.head()

index = df.index.to_list()
columns = df.columns.tolist()

scaler = MinMaxScaler()
df_scaled = scaler.fit_transform(df)

Df = pd.DataFrame(df_scaled, index=index, columns=columns)
Df.info()

Df = Df.drop('Event', axis=1)
x = Df.drop('Q2', axis=1).to_numpy()
y = Df['Q2']
y = np.asarray(y).astype('float32').reshape((-1, 1))

model = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation='relu'),
    tf.keras.layers.Dense(16, activation='relu'),
    tf.keras.layers.Dense(16, activation='sigmoid')
])

epochs = 20

es = tf.keras.callbacks.EarlyStopping(monitor='val_loss',
                                      patience=3,
                                      mode='min',
                                      restore_best_weights=True)

model.compile(loss=tf.keras.losses.BinaryCrossentropy(),
              optimizer=tf.optimizers.Adam(),
              metrics=[tf.keras.metrics.BinaryAccuracy()])

history = model.fit(x, y, epochs=epochs, validation_split=0.3, callbacks=[es])

Running x.shape and y.shape for the x and y being fed into model.fit() returns these values:

x.shape: (100000, 15)
y.shape: (100000, 1)

I'm sorry if there are any blatant mistakes; I'm relatively inexperienced with ML, DL, and tf.keras.

Running this code returns the following error:

ValueError: in user code:

    File "/usr/local/lib/python3.7/dist-packages/keras/engine/training.py", line 1021, in train_function  *
        return step_function(self, iterator)
    File "/usr/local/lib/python3.7/dist-packages/keras/engine/training.py", line 1010, in step_function  **
        outputs = model.distribute_strategy.run(run_step, args=(data,))
    File "/usr/local/lib/python3.7/dist-packages/keras/engine/training.py", line 1000, in run_step  **
        outputs = model.train_step(data)
    File "/usr/local/lib/python3.7/dist-packages/keras/engine/training.py", line 860, in train_step
        loss = self.compute_loss(x, y, y_pred, sample_weight)
    File "/usr/local/lib/python3.7/dist-packages/keras/engine/training.py", line 919, in compute_loss
        y, y_pred, sample_weight, regularization_losses=self.losses)
    File "/usr/local/lib/python3.7/dist-packages/keras/engine/compile_utils.py", line 201, in __call__
        loss_value = loss_obj(y_t, y_p, sample_weight=sw)
    File "/usr/local/lib/python3.7/dist-packages/keras/losses.py", line 141, in __call__
        losses = call_fn(y_true, y_pred)
    File "/usr/local/lib/python3.7/dist-packages/keras/losses.py", line 245, in call  **
        return ag_fn(y_true, y_pred, **self._fn_kwargs)
    File "/usr/local/lib/python3.7/dist-packages/keras/losses.py", line 1932, in binary_crossentropy
        backend.binary_crossentropy(y_true, y_pred, from_logits=from_logits),
    File "/usr/local/lib/python3.7/dist-packages/keras/backend.py", line 5247, in binary_crossentropy
        return tf.nn.sigmoid_cross_entropy_with_logits(labels=target, logits=output)

    ValueError: `logits` and `labels` must have the same shape, received ((None, 16) vs (None, 1)).

The dataset that I'm using can be found at This website. Also, can anyone explain to me what exactly a logit is? I was going off the context and guessing that it's something to do with the features, but looking it up yielded conflicting answers.

CodePudding user response:


Logits and classification

For classification, you usually need to convert a vector of raw values to a probability distribution, i.e., to a vector whose elements are in [0,1] and sum up to 1. In this context, "logits" refer to the raw values before the conversion.

  • For classification between two classes (binary classification), the converted vector only needs one element, representing the probability p1 of the input belonging to class 1, since p0 is implicitly 1 - p1. The conversion from logits to a distribution in this case is done with the Sigmoid function, and the loss function is usually Binary Cross Entropy (BCE).
  • For classification between more than two classes, you will need to one-hot encode the distribution. That is, you'll want the number of elements in the converted vector to equal the number of classes, so that the n-th element represents pn, the probability of the input belonging to the n-th class. In this case, the conversion from logits to a distribution is done with the Softmax function, and the loss function is usually Categorical Cross Entropy (CCE). Both conversions are illustrated in the sketch below.

Note that nothing prevents you from one-hot encoding a binary distribution, i.e., having a converted vector with two elements representing p0 and p1 separately. However, the TensorFlow implementation of BCE loss assumes that the binary distribution is not one-hot encoded.
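To make the two conversions concrete, here is a minimal sketch with toy logit values (the numbers are arbitrary, chosen only for illustration):

import tensorflow as tf

# Binary case: one raw logit per example; sigmoid maps it to p1 in [0, 1].
binary_logits = tf.constant([[-2.0], [0.0], [3.0]])
print(tf.sigmoid(binary_logits))
# -> approximately [[0.12], [0.50], [0.95]]; p0 is implicitly 1 - p1

# Multi-class case: one logit per class; softmax maps the whole vector
# to a distribution whose elements sum to 1.
multiclass_logits = tf.constant([[1.0, 2.0, 0.5]])
print(tf.nn.softmax(multiclass_logits))
# -> approximately [[0.23, 0.63, 0.14]]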


Answer

Since your dataset has y.shape: (100000, 1), it is a binary classification dataset. This requires the output of your network to be a vector of size 1 per example instead of 16.

Furthermore, if you use the TensorFlow BCE loss function, you also have the option to specify (via the from_logits argument) whether the size-1 prediction vector fed to the function contains the raw logits or the distribution. When from_logits=True, the function first applies sigmoid to the prediction vector and then calculates the usual BCE.
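As a quick sanity check with toy values, you can verify that passing raw logits with from_logits=True gives the same loss as applying sigmoid yourself first:

import tensorflow as tf

# A batch of four raw logits and their binary labels.
logits = tf.constant([[-1.2], [0.3], [2.5], [-0.7]])
labels = tf.constant([[0.0], [1.0], [1.0], [0.0]])

# Option 1: let the loss apply sigmoid internally.
loss_a = tf.keras.losses.BinaryCrossentropy(from_logits=True)(labels, logits)

# Option 2: apply sigmoid first, then use the default loss.
loss_b = tf.keras.losses.BinaryCrossentropy()(labels, tf.sigmoid(logits))

print(float(loss_a), float(loss_b))  # the two values agree up to float precision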

So, simply specify your model and loss function as follows (the # <--- comments mark the changed lines):

model = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation='relu'),
    tf.keras.layers.Dense(16, activation='relu'),
    tf.keras.layers.Dense(1, activation='sigmoid')  # <---
])

model.compile(loss=tf.keras.losses.BinaryCrossentropy(),
              optimizer=tf.optimizers.Adam(),
              metrics=[tf.keras.metrics.BinaryAccuracy()])

or

model = tf.keras.Sequential([
    tf.keras.layers.Dense(16, activation='relu'),
    tf.keras.layers.Dense(16, activation='relu'),
    tf.keras.layers.Dense(1)  # <---
])

model.compile(loss=tf.keras.losses.BinaryCrossentropy(from_logits=True),  # <---
              optimizer=tf.optimizers.Adam(),
              metrics=[tf.keras.metrics.BinaryAccuracy()])
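One caveat with the from_logits=True variant: tf.keras.metrics.BinaryAccuracy thresholds the model output at 0.5 by default, which is only meaningful for probabilities. With raw logits you would want tf.keras.metrics.BinaryAccuracy(threshold=0.0), since sigmoid(0) = 0.5.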