"Logits and labels must have the same first dimension" error, despite using sparse categorical crossentropy

These are the shapes of my features and target variables.

(1382, 1785, 2) (1382, 2)

Each sample has two target labels, and each label takes one of the same 28 classes. I have a CNN as follows:

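from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv1D, MaxPooling1D, Flatten, Dense

model = Sequential()
model.add(Conv1D(100, 5, activation='relu', input_shape=(1785, 2)))
model.add(MaxPooling1D(pool_size=5))
model.add(Conv1D(64, 10, activation='relu'))
model.add(MaxPooling1D(pool_size=4))
model.add(Flatten())
model.add(Dense(512, activation='relu'))
model.add(Dense(256, activation='relu'))
# Single 28-way softmax output: one class prediction per sample
model.add(Dense(28, activation='softmax'))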

When I use one-hot encoded targets of shape (1382, 28) with the categorical crossentropy loss, the model trains without errors.

But when I use sparse targets of shape (1382, 2) with the sparse categorical crossentropy loss, I get the following error.

logits and labels must have the same first dimension, got logits shape [20,28] and labels shape [40]
 [[node sparse_categorical_crossentropy/SparseSoftmaxCrossEntropyWithLogits/SparseSoftmaxCrossEntropyWithLogits (defined at \AppData\Local\Temp/ipykernel_9932/3729291395.py:1) ]] [Op:__inference_train_function_11741]

From what I have seen, other people who posted about this error were using sparse categorical crossentropy with one-hot encoded targets, which is not my case.

I suspect the problem lies in the shapes of the batches. The logits have shape [x, 28], where x is the batch size, while the labels end up with twice as many entries. That points to my two labels per sample as the likely cause, but I have no leads on how to troubleshoot it from there.

Any help is highly appreciated.

CodePudding user response:

If you are using SparseCategoricalCrossentropy as your loss function, each sample must have exactly one integer class label in the range 0 to 27, so the label array has one entry per sample. For example:

import tensorflow as tf

samples = 25
# One integer class index in [0, 28) per sample
labels = tf.random.uniform((samples,), maxval=28, dtype=tf.int32)
print(labels)

tf.Tensor(
[12  7  1 13 22 14 26 13  6  1 27  1 11 18  5 18  5  6 12 14 21 18 17 12
  5], shape=(25,), dtype=int32)

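Note the shape of labels: it is neither (25, 2) nor (25, 28) but (25,), which is what SparseCategoricalCrossentropy expects when the model outputs one 28-way prediction per sample. With your (1382, 2) targets, a batch of 20 samples carries 40 label values against only 20 rows of logits, which is exactly the [20,28] versus [40] mismatch in the error.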
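
If you want to keep both labels as sparse integers, one possible approach is to give the model one softmax output per label and pass each column of the (N, 2) target array to its own output. The following is a minimal sketch under that assumption, not the only possible fix. The random data, the reduced sample count of 64, and the output names label_a and label_b are made up for illustration; the layers mirror the network from the question.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, Model

num_classes = 28

# Random stand-in data with the per-sample shapes from the question
# (only 64 samples here to keep the sketch quick to run).
x = np.random.rand(64, 1785, 2).astype("float32")
y = np.random.randint(0, num_classes, size=(64, 2))  # two integer labels per sample

# Same convolutional trunk as in the question, written with the functional API.
inputs = layers.Input(shape=(1785, 2))
h = layers.Conv1D(100, 5, activation="relu")(inputs)
h = layers.MaxPooling1D(pool_size=5)(h)
h = layers.Conv1D(64, 10, activation="relu")(h)
h = layers.MaxPooling1D(pool_size=4)(h)
h = layers.Flatten()(h)
h = layers.Dense(512, activation="relu")(h)
h = layers.Dense(256, activation="relu")(h)

# One 28-way softmax head per label (hypothetical names).
out_a = layers.Dense(num_classes, activation="softmax", name="label_a")(h)
out_b = layers.Dense(num_classes, activation="softmax", name="label_b")(h)

model = Model(inputs, [out_a, out_b])
model.compile(
    optimizer="adam",
    loss=["sparse_categorical_crossentropy", "sparse_categorical_crossentropy"],
)

# Each head receives its own (N,) column of integer labels,
# so logits and labels agree on the first dimension.
history = model.fit(x, [y[:, 0], y[:, 1]], batch_size=20, epochs=1, verbose=0)
```

With this layout every batch of 20 samples produces 20 rows of logits and 20 labels per head, so the first dimensions match. The total loss is the sum of the two per-head losses.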