I am building a very simple LSTM network using the imdb dataset in TensorFlow, in particular running on an Apple Silicon chip (M1 Max).
My code is the following:
import tensorflow as tf

def get_and_pad_imdb_dataset(num_words=10000, maxlen=None, index_from=2):
    from tensorflow.keras.datasets import imdb

    # Load the reviews
    (x_train, y_train), (x_test, y_test) = imdb.load_data(path='imdb.npz',
                                                          num_words=num_words,
                                                          skip_top=0,
                                                          maxlen=maxlen,
                                                          start_char=1,
                                                          oov_char=2,
                                                          index_from=index_from)

    # Pad/truncate so every sequence has the same length (0 is the padding token)
    x_train = tf.keras.preprocessing.sequence.pad_sequences(x_train,
                                                            maxlen=None,
                                                            padding='pre',
                                                            truncating='pre',
                                                            value=0)
    x_test = tf.keras.preprocessing.sequence.pad_sequences(x_test,
                                                           maxlen=None,
                                                           padding='pre',
                                                           truncating='pre',
                                                           value=0)
    return (x_train, y_train), (x_test, y_test)

def get_imdb_word_index(num_words=10000, index_from=2):
    imdb_word_index = tf.keras.datasets.imdb.get_word_index(
        path='imdb_word_index.json')
    # Shift the raw indices by index_from, matching the offset used in load_data
    imdb_word_index = {key: value + index_from for
                       key, value in imdb_word_index.items() if value <= num_words - index_from}
    return imdb_word_index

(x_train, y_train), (x_test, y_test) = get_and_pad_imdb_dataset(maxlen=25)
imdb_word_index = get_imdb_word_index()
max_index_value = max(imdb_word_index.values())

embedding_dim = 16
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(input_dim=max_index_value + 1, output_dim=embedding_dim, mask_zero=True),
    tf.keras.layers.LSTM(units=16),
    tf.keras.layers.Dense(units=1, activation='sigmoid')
])

model.compile(loss='binary_crossentropy', metrics=['accuracy'], optimizer='adam')
history = model.fit(x_train, y_train, epochs=3, batch_size=32)
The code works absolutely fine on Google Colab, so I am quite sure it is correct. However, on my Apple Silicon machine it gets stuck at the very first epoch and doesn't seem to progress at all.
This is all I get from the log:
Epoch 1/3
2022-02-15 22:10:34.093907: I tensorflow/core/grappler/optimizers/custom_graph_optimizer_registry.cc:112] Plugin optimizer for device_type GPU is enabled.
I have already used my Apple Silicon chip with other TensorFlow models without any problems at all.
Is there a way to debug/see what is going on with TensorFlow on my Apple Silicon during the fit call? Does anyone know what the problem could be? Could anyone test this code on another M1 machine, if possible?
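For reference, this is the kind of isolation test I had in mind (just a sketch using standard TensorFlow options, not a confirmed fix; it reuses the model and data defined above):

import tensorflow as tf

# Hide the Metal GPU so training falls back to the CPU; if the hang
# disappears, the GPU plugin is the likely culprit. This must run
# before any op touches the GPU.
tf.config.set_visible_devices([], 'GPU')

# Recompile with run_eagerly=True so each batch executes op by op,
# which makes it easier to see where training stalls.
model.compile(loss='binary_crossentropy', metrics=['accuracy'],
              optimizer='adam', run_eagerly=True)
model.fit(x_train, y_train, epochs=1, batch_size=32, verbose=2)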
CodePudding user response:
TensorFlow is not supported on M1 silicon.
It is possible to get it installed with some features working, but anything that relies on C code under the hood won't work. There are some workarounds, such as installing with Miniforge, but you're always going to have some issues.
CodePudding user response:
Updating tensorflow-macos to 2.8 and tensorflow-metal to 0.4 solved the problem.
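A quick way to confirm what you are actually running after the upgrade (a minimal sketch; the version numbers are simply the ones that worked for me):

import tensorflow as tf

print(tf.__version__)                           # should report 2.8.x
print(tf.config.list_physical_devices('GPU'))   # the Metal GPU device should be listed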
It seems to be a common bug affecting previous TensorFlow versions, in particular when working with text-related layers/models. In my case, for example, the problem was the Embedding layer.
A similar issue has been opened here; it was likewise fixed by updating TensorFlow.
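If you want to check whether your own install is affected before retraining a full model, here is the kind of minimal probe I would use, assuming the Embedding layer really is the trigger (the shapes and data are made up):

import numpy as np
import tensorflow as tf

# Tiny stand-in data: 64 "reviews" of length 25, vocabulary of 10000
x = np.random.randint(1, 10000, size=(64, 25))
y = np.random.randint(0, 2, size=(64,))

# Smallest model that exercises the Embedding layer with masking enabled
probe = tf.keras.Sequential([
    tf.keras.layers.Embedding(input_dim=10000, output_dim=16, mask_zero=True),
    tf.keras.layers.GlobalAveragePooling1D(),
    tf.keras.layers.Dense(1, activation='sigmoid')
])
probe.compile(loss='binary_crossentropy', optimizer='adam')

# On the broken tensorflow-macos/tensorflow-metal combination this fit call
# stalled for me; after updating it completes in a second or two.
probe.fit(x, y, epochs=1, batch_size=32, verbose=2)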