I am using Ubuntu 20.04 on WSL2, running on Windows 11. The code I am executing is as follows:
import tensorflow as tf
from tensorflow import keras
from keras.layers.convolutional import Conv2D, MaxPooling2D
import numpy as np
(X_train, y_train), (X_test, y_test) = keras.datasets.mnist.load_data()
model = keras.Sequential([
    keras.layers.Flatten(input_shape=(28, 28)),
    keras.layers.Dense(2500, input_shape=(784,), activation='relu'),
    keras.layers.Dense(2000, activation='relu'),
    keras.layers.Dense(1500, activation='relu'),
    keras.layers.Dense(1000, activation='relu'),
    keras.layers.Dense(500, activation='relu'),
    keras.layers.Dense(10, activation='sigmoid')
])
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
print(model.summary())
model.fit(X_train, y_train, epochs=500)
If I run the code on CPU the output is as follows:
Epoch 205/500
1875/1875 [==============================] - 80s 43ms/step - loss: 0.1887 - accuracy: 0.9466
Epoch 206/500
1875/1875 [==============================] - 79s 42ms/step - loss: 0.3433 - accuracy: 0.9484
Epoch 207/500
1875/1875 [==============================] - 79s 42ms/step - loss: 0.1987 - accuracy: 0.9690
Epoch 208/500
1875/1875 [==============================] - 80s 43ms/step - loss: 0.2632 - accuracy: 0.9582
But if I run the same code in a Docker container (tensorflow/tensorflow:latest-gpu-py3-jupyter), the output is as follows:
Epoch 205/500
60000/60000 [==============================] - 45s 752us/sample - loss: 9.5371 - accuracy: 0.0987
Epoch 206/500
60000/60000 [==============================] - 45s 749us/sample - loss: 9.5371 - accuracy: 0.0987
Epoch 207/500
60000/60000 [==============================] - 45s 749us/sample - loss: 9.5371 - accuracy: 0.0987
Epoch 208/500
60000/60000 [==============================] - 45s 745us/sample - loss: 9.5371 - accuracy: 0.0987
The accuracy is constant.
The installation was made based on:
In the installation process I did not get any error.
Another issue is that compiling the model on the GPU takes a very long time (more than 5 minutes).
Thanks in advance for any help/idea.
CodePudding user response:
You should reduce the number of neurons and layers, and use softmax in Dense(10), since you have a multiclass output.
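A minimal sketch of what that change could look like, assuming the same tf.keras API as in the question. The layer sizes (256 and 128), the epoch count, and the helper name build_model are illustrative assumptions rather than tuned or recommended values. Scaling the pixels to [0, 1] is an extra assumption that is not part of the suggestion above, although it is a common preprocessing step for MNIST.

```python
from tensorflow import keras


def build_model():
    """Build a smaller MNIST classifier with a softmax output layer."""
    model = keras.Sequential([
        keras.layers.Flatten(input_shape=(28, 28)),
        # Far fewer units and layers than the original; sizes are assumptions.
        keras.layers.Dense(256, activation='relu'),
        keras.layers.Dense(128, activation='relu'),
        # softmax makes the 10 outputs a single probability distribution,
        # which matches a multiclass target with one label per image.
        keras.layers.Dense(10, activation='softmax'),
    ])
    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])
    return model


if __name__ == '__main__':
    (X_train, y_train), (X_test, y_test) = keras.datasets.mnist.load_data()
    # Assumption (not from the answer): scale pixels from [0, 255] to [0, 1].
    X_train = X_train / 255.0
    X_test = X_test / 255.0

    model = build_model()
    model.fit(X_train, y_train, epochs=5)
    model.evaluate(X_test, y_test)
```

If the accuracy still stays near 0.10 on the GPU image with a model like this, the problem is more likely in the environment than in the architecture.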