I was given an assignment to learn how to build a model that classifies images. After making one I was getting 100% accuracy, so I decided to trim down my model's layers until I got something worse. I removed the normalization of my input data, the Conv2D layers, the MaxPooling2D layers and the hidden Dense layers.
I'm now down to what I think is the bare bones, and I'm still getting 100% accuracy, which I seriously doubt is genuine. Manual spot checks on the test data pass as well, but I'm confused as to why.
When running, the only warning I get is: 2021-12-21 14:05:19.952543: W tensorflow/python/util/util.cc:368] Sets are not currently considered sequences, but this may change in the future, so consider avoiding using them.
My Model
model = keras.Sequential()
model.add(keras.Input(shape=(IMG_WIDTH, IMG_HEIGHT, 3,)))
model.add(keras.layers.Flatten())
model.add(keras.layers.Dense(NUM_CATEGORIES, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.summary()
Model: "sequential"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
flatten (Flatten) (None, 2700) 0
dense (Dense) (None, 43) 116143
=================================================================
Total params: 116,143
Trainable params: 116,143
Non-trainable params: 0
_________________________________________________________________
None
model.fit(x_train, y_train, epochs=EPOCHS)
Epoch 1/10
500/500 [==============================] - 1s 2ms/step - loss: 32.7355 - accuracy: 0.8956
Epoch 2/10
500/500 [==============================] - 1s 2ms/step - loss: 3.2202e-07 - accuracy: 1.0000
Epoch 3/10
...
Epoch 9/10
500/500 [==============================] - 1s 2ms/step - loss: 2.1457e-08 - accuracy: 1.0000
Epoch 10/10
500/500 [==============================] - 1s 2ms/step - loss: 1.6482e-08 - accuracy: 1.0000
model.evaluate(x_test, y_test, verbose=2)
333/333 - 1s - loss: 1.3704e-08 - accuracy: 1.0000 - 510ms/epoch - 2ms/step
I manually inspected random samples of the model's input, its output and the associated label. I couldn't see the label encoded anywhere in the input, yet the output does match the label.
i = x_test[10:11]
r = model.predict(x_test[10:11])
l = y_test[10:11]
print(i)
print(np.argmax(r), np.argmax(l))
Results
[[[[101 111 157]
[101 111 157]
[116 122 157]
...
[ 82 104 88]
[ 56 85 81]
[ 55 79 86]]
[[101 111 157]
[101 111 157]
[116 122 157]
...
[101 131 160]
[ 99 128 156]
[ 89 122 149]]]]
12 12
CodePudding user response:
When I see 100% accuracy, I think of an overfitting problem, but your model is so simple.
Getting 100% on the training set may be okay, but on the test set I don't think so.
Some possible mistakes you may have made:
- You may have leaked training data into the test set
- You are using a small dataset
- The samples in your dataset are very similar to each other, or it contains duplicates
- Check the class distribution: you may have one category that holds most of the data while the others have only a few samples.
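The data-related points in the list above can be checked directly. Below is a minimal sketch, assuming x_train and x_test are NumPy arrays of images that all share one shape and dtype, and that the labels are one-hot encoded (which the categorical_crossentropy loss in the question suggests). The helper names count_overlap and class_counts are made up for illustration and are not part of Keras.

```python
import hashlib
from collections import Counter

import numpy as np


def _fingerprint(sample):
    # Hash the raw pixel bytes; only meaningful if all samples share shape and dtype.
    return hashlib.sha1(np.ascontiguousarray(sample).tobytes()).hexdigest()


def count_overlap(x_train, x_test):
    """Count test samples that are byte-for-byte identical to some training sample."""
    train_hashes = {_fingerprint(s) for s in x_train}
    return sum(_fingerprint(s) in train_hashes for s in x_test)


def class_counts(y_one_hot):
    """Count samples per class id, assuming one-hot encoded labels."""
    return Counter(np.argmax(y_one_hot, axis=1).tolist())
```

With the variables from the question this would be called as count_overlap(x_train, x_test) and class_counts(y_train). An overlap close to len(x_test) would point to leakage or duplicates, and a very lopsided class count would point to an imbalanced dataset. Note that this only catches exact duplicates, not near-identical images.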
As a suggestion, you could add another metric such as precision or recall to your metrics array and see the results.
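Keras ships keras.metrics.Precision and keras.metrics.Recall, which can be passed in the metrics list of model.compile. To see the numbers per class rather than as a single aggregate, here is a rough sketch that computes them from predicted class ids with NumPy. The function name per_class_precision_recall is hypothetical, and defining precision or recall as 0 for a class with no predictions or no samples is a choice made here for illustration.

```python
import numpy as np


def per_class_precision_recall(y_true, y_pred, num_classes):
    """Return (precision, recall) arrays with one entry per class.

    y_true and y_pred are 1-D sequences of integer class ids, e.g. the result
    of np.argmax(..., axis=1) on one-hot labels and on model.predict output.
    """
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    # Confusion matrix: rows are true classes, columns are predicted classes.
    confusion = np.zeros((num_classes, num_classes), dtype=np.int64)
    np.add.at(confusion, (y_true, y_pred), 1)
    true_pos = np.diag(confusion).astype(float)
    predicted = confusion.sum(axis=0)  # how often each class was predicted
    actual = confusion.sum(axis=1)     # how often each class really occurs
    # Classes that never occur (or are never predicted) get 0 instead of NaN.
    precision = np.divide(true_pos, predicted,
                          out=np.zeros(num_classes), where=predicted > 0)
    recall = np.divide(true_pos, actual,
                       out=np.zeros(num_classes), where=actual > 0)
    return precision, recall
```

For the model in the question, assuming one-hot labels, the call would look like per_class_precision_recall(np.argmax(y_test, axis=1), np.argmax(model.predict(x_test), axis=1), NUM_CATEGORIES). If every class shows 1.0 for both values, the perfect accuracy is not an artifact of one dominant category.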
Also check this similar post; you may get some hints.