Why does my model give 100% accuracy with only one layer of softmax?

I was given an assignment to learn how to create a model that could classify images. After building one I was getting 100% accuracy, so I decided to trim down the model's layers until I got something worse. I removed the normalization of my input data, the Conv2D layers, the MaxPooling2D layers and the hidden Dense layers.

I'm now down to what I think is the bare bones and I'm still getting 100% accuracy, which I seriously doubt is right. The model passes my manual spot checks on the test data, but I'm confused as to why.

When running, the only warning I get is:

2021-12-21 14:05:19.952543: W tensorflow/python/util/util.cc:368] Sets are not currently considered sequences, but this may change in the future, so consider avoiding using them.

My Model

from tensorflow import keras

# A single softmax layer applied directly to the flattened raw pixels
model = keras.Sequential()
model.add(keras.Input(shape=(IMG_WIDTH, IMG_HEIGHT, 3)))
model.add(keras.layers.Flatten())
model.add(keras.layers.Dense(NUM_CATEGORIES, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

model.summary()

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
=================================================================
 flatten (Flatten)           (None, 2700)              0         
                                                                 
 dense (Dense)               (None, 43)                116143    
                                                                 
=================================================================
Total params: 116,143
Trainable params: 116,143
Non-trainable params: 0
_________________________________________________________________
None

model.fit(x_train, y_train, epochs=EPOCHS)

Epoch 1/10
500/500 [==============================] - 1s 2ms/step - loss: 32.7355 - accuracy: 0.8956
Epoch 2/10
500/500 [==============================] - 1s 2ms/step - loss: 3.2202e-07 - accuracy: 1.0000
Epoch 3/10
...
Epoch 9/10
500/500 [==============================] - 1s 2ms/step - loss: 2.1457e-08 - accuracy: 1.0000
Epoch 10/10
500/500 [==============================] - 1s 2ms/step - loss: 1.6482e-08 - accuracy: 1.0000

model.evaluate(x_test, y_test, verbose=2)

333/333 - 1s - loss: 1.3704e-08 - accuracy: 1.0000 - 510ms/epoch - 2ms/step

I manually looked at random samples of the model's input data, its output and the associated label. I couldn't see my label encoded in the input, and the output does match my label.

import numpy as np

# Compare the predicted class with the true class for one test sample
i = x_test[10:11]
r = model.predict(i)
l = y_test[10:11]
print(i)
print(np.argmax(r), np.argmax(l))

Results

[[[[101 111 157]
   [101 111 157]
   [116 122 157]
   ...
   [ 82 104  88]
   [ 56  85  81]
   [ 55  79  86]]
  [[101 111 157]
   [101 111 157]
   [116 122 157]
   ...
   [101 131 160]
   [ 99 128 156]
   [ 89 122 149]]]]
12 12

Answer:

When I see 100% accuracy, I think of an overfitting problem, but your model is very simple.

Getting 100% on the training set may be okay, but on the test set I don't think so.

So here are some possible mistakes:

Data from the training set may have leaked into the test set.
Your dataset may be small.
The samples in your dataset may be very similar to each other, or it may contain duplicates.
Check the class distribution: one category may hold most of the data while the others have only a few samples.
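
The leakage, duplicate and distribution checks above could be sketched roughly like this. It assumes x_train and x_test are NumPy arrays with the same dtype and per-sample shape, and that the labels are one-hot encoded, as categorical_crossentropy suggests. The function names are made up for illustration. Note that the byte comparison only catches exact duplicates, not near-identical images.

```python
import numpy as np

def count_shared_samples(x_a, x_b):
    """Count samples in x_b that are byte-identical to some sample in x_a."""
    seen = {sample.tobytes() for sample in x_a}
    return sum(sample.tobytes() in seen for sample in x_b)

def count_internal_duplicates(x):
    """Count samples in x that exactly repeat another sample in x."""
    return len(x) - len({sample.tobytes() for sample in x})

def class_counts(y_one_hot, num_categories):
    """Return the number of samples per class for one-hot encoded labels."""
    class_ids = np.argmax(y_one_hot, axis=1)
    return np.bincount(class_ids, minlength=num_categories)
```

For example, count_shared_samples(x_train, x_test) should come out at or near 0 if the split is clean, and class_counts(y_train, NUM_CATEGORIES) shows whether a few categories dominate.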

As a suggestion, you could add another metric such as precision or recall to your metrics list and see the results.
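
Keras provides keras.metrics.Precision() and keras.metrics.Recall(), which can be passed in the metrics list of model.compile. As a complementary check, here is a rough NumPy sketch that computes precision and recall per class from the output of model.predict, so you can see whether any category is never predicted or never present. It assumes one-hot labels and a probability row per sample. The function name is hypothetical.

```python
import numpy as np

def per_class_precision_recall(y_true_one_hot, y_pred_probs):
    """Return (precision, recall, confusion) with one entry per class.

    Classes with no predictions (or no true samples) get NaN precision
    (or recall) instead of raising a division-by-zero warning.
    """
    num_classes = y_true_one_hot.shape[1]
    true_idx = np.argmax(y_true_one_hot, axis=1)
    pred_idx = np.argmax(y_pred_probs, axis=1)

    # confusion[t, p] = number of samples of true class t predicted as p
    confusion = np.zeros((num_classes, num_classes), dtype=np.int64)
    np.add.at(confusion, (true_idx, pred_idx), 1)

    true_positives = np.diag(confusion).astype(float)
    predicted = confusion.sum(axis=0)  # column sums: times each class was predicted
    actual = confusion.sum(axis=1)     # row sums: true samples of each class

    precision = np.divide(true_positives, predicted,
                          out=np.full(num_classes, np.nan), where=predicted > 0)
    recall = np.divide(true_positives, actual,
                       out=np.full(num_classes, np.nan), where=actual > 0)
    return precision, recall, confusion
```

You would call it as per_class_precision_recall(y_test, model.predict(x_test)). If every class really shows precision and recall of 1.0, the problem is more likely in the data than in the metric.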

Also check this similar post; you may get some hints.