I am having problems using TensorFlow 2.5 on Google Colab. I assume there is an incompatibility with the CUDA and/or cuDNN version. How would I fix it?
I checked the CUDA version used by Colab. It is 11.2, which should be fine for TF 2.5. That would mean the problem is with cuDNN, right?
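One way to check this is to ask the installed TensorFlow build which CUDA and cuDNN versions it was compiled against. This is only a minimal sketch: it assumes a TensorFlow release that provides tf.sysconfig.get_build_info() (2.3 or newer, as far as I know), and the helper name build_versions is made up for illustration. On a CPU-only build the CUDA/cuDNN entries may simply be missing, so they are read with .get().

```python
# Sketch: report the CUDA/cuDNN versions the installed TensorFlow was built against.
# build_versions is a hypothetical helper name, not part of TensorFlow.
try:
    import tensorflow as tf
except ImportError:  # lets the snippet run even where TensorFlow is absent
    tf = None


def build_versions():
    """Return a dict of version strings, or None if TensorFlow is not installed."""
    if tf is None:
        return None
    info = tf.sysconfig.get_build_info()  # dict-like; GPU builds include CUDA keys
    return {
        "tensorflow": tf.__version__,
        "cuda": info.get("cuda_version"),    # None on CPU-only builds
        "cudnn": info.get("cudnn_version"),  # None on CPU-only builds
    }


print(build_versions())
```

Comparing these values with what the runtime actually provides (for example the output of !nvcc --version) should show whether the build and the environment disagree.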
Code to reproduce:
!pip install tensorflow==2.5.0
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.datasets import cifar10
(x_train, y_train), (x_test, y_test) = cifar10.load_data()
x_train = x_train.astype("float32") / 255.0
x_test = x_test.astype("float32") / 255.0
def my_model():
    inputs = keras.Input(shape=(32, 32, 3))
    x = layers.Conv2D(32, 3)(inputs)
    x = layers.BatchNormalization()(x)
    x = keras.activations.relu(x)
    x = layers.MaxPooling2D()(x)
    x = layers.Conv2D(64, 3)(x)
    x = layers.BatchNormalization()(x)
    x = keras.activations.relu(x)
    x = layers.MaxPooling2D()(x)
    x = layers.Conv2D(128, 3)(x)
    x = layers.BatchNormalization()(x)
    x = keras.activations.relu(x)
    x = layers.Flatten()(x)
    x = layers.Dense(64, activation="relu")(x)
    outputs = layers.Dense(10)(x)
    model = keras.Model(inputs=inputs, outputs=outputs)
    return model
model = my_model()
model.compile(
    loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    optimizer=keras.optimizers.Adam(learning_rate=3e-4),
    metrics=["accuracy"],
)
model.fit(x_train, y_train, batch_size=64, epochs=10, verbose=2)
model.evaluate(x_test, y_test, batch_size=64, verbose=2)
I have tried this answer, but I get the same error.
This answer also proposes calling tf.config.experimental.set_memory_growth(gpu, True),
but that does not work either; I get the same error.
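For reference, the memory-growth setting is usually applied roughly as follows. This is a sketch, not the exact code from the linked answer: the helper name enable_memory_growth is hypothetical, and the call has to happen before TensorFlow initializes the GPU (that is, before any model or tensor is created), otherwise it raises a RuntimeError.

```python
# Sketch: enable memory growth on every GPU TensorFlow can see.
# enable_memory_growth is a hypothetical helper name, not part of TensorFlow.
try:
    import tensorflow as tf
except ImportError:  # lets the snippet run even where TensorFlow is absent
    tf = None


def enable_memory_growth():
    """Turn on memory growth for each visible GPU and return how many were set."""
    if tf is None:
        return 0
    gpus = tf.config.list_physical_devices("GPU")
    for gpu in gpus:
        # Must run before the GPU is initialized, or TensorFlow raises RuntimeError.
        tf.config.experimental.set_memory_growth(gpu, True)
    return len(gpus)


print("GPUs configured:", enable_memory_growth())
```

If this prints 0 on a GPU runtime, TensorFlow is not seeing the GPU at all, which would point to a version mismatch rather than a memory problem.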
I want to use the GPU. I know that everything works fine without hardware acceleration.
CodePudding user response:
In this documentation, Google warns against installing or downgrading TensorFlow with the !pip command.
They wrote:
Colab builds TensorFlow from source to ensure compatibility with our fleet of accelerators. Versions of TensorFlow fetched from PyPI by pip may suffer from performance problems or may not work at all.
This means that any other TensorFlow version we install might not be compatible with the GPU/TPU configuration Colab provides. So just use TensorFlow 2.6 (currently the latest version), which is very similar to version 2.5.
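To confirm the setup before training, drop the !pip install line and check which TensorFlow the runtime provides and whether it sees the GPU. The snippet below is a minimal sketch that assumes the Colab runtime already ships TensorFlow and that a GPU runtime is selected; gpu_report is a hypothetical helper name.

```python
# Sketch: report the TensorFlow version already present and the GPUs it can see.
# gpu_report is a hypothetical helper name, not part of TensorFlow.
try:
    import tensorflow as tf
except ImportError:  # lets the snippet run even where TensorFlow is absent
    tf = None


def gpu_report():
    """Return a one-line summary of the TensorFlow version and visible GPUs."""
    if tf is None:
        return "TensorFlow is not installed"
    gpus = tf.config.list_physical_devices("GPU")
    return f"TensorFlow {tf.__version__}, {len(gpus)} GPU(s) visible"


print(gpu_report())
```

If at least one GPU is reported, the rest of the code from the question can be run unchanged after this check.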