I am learning about recurrent neural networks and I found the CuDNNLSTM layer, which is much faster than the usual LSTM. So I tried to fit a CuDNNLSTM model, but the only thing the program displays is "Epoch 1"; then nothing happens, and my kernel dies (I am working in jupyter-notebook). In the Jupyter terminal I found this:
2022-05-25 22:22:59.693801: I tensorflow/stream_executor/cuda/cuda_dnn.cc:384] Loaded cuDNN version 8100
2022-05-25 22:23:00.149065: E tensorflow/stream_executor/cuda/cuda_driver.cc:1018] failed to synchronize the stop event: CUDA_ERROR_LAUNCH_FAILED: unspecified launch failure
2022-05-25 22:23:00.149218: E tensorflow/stream_executor/gpu/gpu_timer.cc:55] INTERNAL: Error destroying CUDA event: CUDA_ERROR_LAUNCH_FAILED: unspecified launch failure
2022-05-25 22:23:00.150008: E tensorflow/stream_executor/gpu/gpu_timer.cc:60] INTERNAL: Error destroying CUDA event: CUDA_ERROR_LAUNCH_FAILED: unspecified launch failure
2022-05-25 22:23:00.150355: F tensorflow/stream_executor/cuda/cuda_dnn.cc:217] Check failed: status== CUDNN_STATUS_SUCCESS (7 vs. 0)Failed to set cuDNN stream.
I have installed tensorflow-gpu together with CuDNN and CUDA versions compatible with my TensorFlow version:
tensorflow version: 2.9.0
CUDA version: 11.2
CuDNN version: 8.1
I have also tried the same model with ordinary LSTM layers, and that worked, but it is very slow, so I want to figure out how to use the CuDNNLSTM model.
My code:
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, LSTM
from tensorflow.compat.v1.keras.layers import CuDNNLSTM

mnist = tf.keras.datasets.mnist
(X_train, y_train), (X_test, y_test) = mnist.load_data()

# Scale pixel values to [0, 1]
X_train = X_train/255.0
X_test = X_test/255.0

model = Sequential()
model.add(CuDNNLSTM(128, input_shape=X_train.shape[1:], return_sequences=True))
model.add(Dropout(0.2))
model.add(CuDNNLSTM(128))
model.add(Dropout(0.2))
model.add(Dense(32, activation="relu"))
model.add(Dropout(0.2))
model.add(Dense(10, activation="softmax"))

opt = tf.keras.optimizers.Adam(learning_rate=1e-3, decay=1e-5)
model.compile(loss="sparse_categorical_crossentropy",
              optimizer=opt,
              metrics=["accuracy"])
model.fit(X_train, y_train, epochs=3, validation_data=(X_test, y_test))
If somebody has had the same problem or knows how to fix it, I would be grateful for help. Thanks in advance.
CodePudding user response:
Have you tried this with the tanh activation function? From my understanding, the cuDNN kernel requires it. Details below:
Long Short-Term Memory layer - Hochreiter 1997.
See the Keras RNN API guide for details about the usage of the RNN API.
Based on available runtime hardware and constraints, this layer
will choose different implementations (cuDNN-based or pure-TensorFlow)
to maximize the performance. If a GPU is available and all
the arguments to the layer meet the requirement of the cuDNN kernel
(see below for details), the layer will use a fast cuDNN implementation.
The requirements to use the cuDNN implementation are:
- activation == tanh
- recurrent_activation == sigmoid
- recurrent_dropout == 0
- unroll is False
- use_bias is True
- Inputs, if use masking, are strictly right-padded.
- Eager execution is enabled in the outermost context.
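As a minimal sketch (assuming the underlying CUDA/cuDNN installation is otherwise healthy), here is the model from the question rewritten with the built-in tf.keras.layers.LSTM and its default arguments, which satisfy all of the requirements above, so TensorFlow 2.9 should dispatch to the fast cuDNN kernel automatically on a GPU:

import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, LSTM

mnist = tf.keras.datasets.mnist
(X_train, y_train), (X_test, y_test) = mnist.load_data()
X_train, X_test = X_train/255.0, X_test/255.0

model = Sequential()
# Defaults: activation="tanh", recurrent_activation="sigmoid",
# recurrent_dropout=0, unroll=False, use_bias=True -- these meet the
# cuDNN requirements listed above, so the fast kernel is selected.
model.add(LSTM(128, input_shape=X_train.shape[1:], return_sequences=True))
model.add(Dropout(0.2))
model.add(LSTM(128))
model.add(Dropout(0.2))
model.add(Dense(32, activation="relu"))
model.add(Dropout(0.2))
model.add(Dense(10, activation="softmax"))

opt = tf.keras.optimizers.Adam(learning_rate=1e-3, decay=1e-5)
model.compile(loss="sparse_categorical_crossentropy",
              optimizer=opt,
              metrics=["accuracy"])
model.fit(X_train, y_train, epochs=3, validation_data=(X_test, y_test))

Unlike the compat-v1 CuDNNLSTM layer, this version also falls back to the pure-TensorFlow implementation if no GPU is visible, so the same code runs everywhere.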