These are the shapes of my features and target variables.
(1382, 1785, 2) (1382, 2)
The target here has two labels, and each label has the same 28 classes. I have a CNN network as follows:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv1D, MaxPooling1D, Flatten, Dense

model = Sequential()
model.add(Conv1D(100, 5, activation='relu', input_shape=(1785, 2)))
model.add(MaxPooling1D(pool_size=5))
model.add(Conv1D(64, 10, activation='relu'))
model.add(MaxPooling1D(pool_size=4))
model.add(Flatten())
model.add(Dense(512, activation='relu'))
model.add(Dense(256, activation='relu'))
model.add(Dense(28, activation='softmax'))
When I use one-hot encoded targets of shape (1382, 28) and the categorical crossentropy loss function, the model runs fine and gives no errors.
But when I use sparse targets of shape (1382, 2) and the sparse categorical crossentropy loss function, I run into the following error:
logits and labels must have the same first dimension, got logits shape [20,28] and labels shape [40]
[[node sparse_categorical_crossentropy/SparseSoftmaxCrossEntropyWithLogits/SparseSoftmaxCrossEntropyWithLogits (defined at \AppData\Local\Temp/ipykernel_9932/3729291395.py:1) ]] [Op:__inference_train_function_11741]
From what I have seen, other people who have posted the same problem seem to have been using sparse categorical crossentropy with one-hot encoded target variables.
I think there may be a problem with the shapes of the batches: the logits' shape changes to [x, 28], where x is the batch size. Another possible issue is that I have two labels per sample, but I have no leads on how to troubleshoot the problem from there.
Any help is highly appreciated.
CodePudding user response:
If you are using SparseCategoricalCrossentropy as your loss function, you need to make sure that each data sample in your data belongs to one class ranging from 0 to 27. For example:
import tensorflow as tf

samples = 25
labels = tf.random.uniform((samples,), maxval=28, dtype=tf.int32)
print(labels)
tf.Tensor(
[12  7  1 13 22 14 26 13  6  1 27  1 11 18  5 18  5  6 12 14 21 18 17 12
  5], shape=(25,), dtype=int32)
Consider the shape of labels: it is neither (25, 2) nor (25, 28), but rather (25,), which is what works with SparseCategoricalCrossentropy.
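Since your targets have shape (1382, 2) — two labels per sample, each drawn from 28 classes — one possible fix (a sketch under that assumption, not the only approach) is a functional-API model with two softmax output heads, each compiled with its own sparse categorical crossentropy. Each head then receives a plain (batch,) vector of integer class ids, which is the shape the loss expects. The head names and the dummy data below are illustrative:

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, Model

# Same backbone as the question's Sequential model, but with two heads,
# one per label column; each predicts one of 28 classes.
inputs = layers.Input(shape=(1785, 2))
x = layers.Conv1D(100, 5, activation='relu')(inputs)
x = layers.MaxPooling1D(pool_size=5)(x)
x = layers.Conv1D(64, 10, activation='relu')(x)
x = layers.MaxPooling1D(pool_size=4)(x)
x = layers.Flatten()(x)
x = layers.Dense(512, activation='relu')(x)
x = layers.Dense(256, activation='relu')(x)
out1 = layers.Dense(28, activation='softmax', name='label_1')(x)
out2 = layers.Dense(28, activation='softmax', name='label_2')(x)

model = Model(inputs, [out1, out2])
# The same loss string is applied to each output head.
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy')

# Dummy data with the question's shapes (small sample count for speed).
X = np.random.rand(8, 1785, 2).astype('float32')
y = np.random.randint(0, 28, size=(8, 2))
# Split the (batch, 2) targets into two (batch,) integer vectors, one per head.
model.fit(X, [y[:, 0], y[:, 1]], epochs=1, batch_size=4, verbose=0)
```

Alternatively, you could keep a single model and treat the pair of labels as one combined label, but two heads keep each output a clean (batch,) / (batch, 28) pairing for the sparse loss.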