I'm trying to build a neural network that predicts the next number in a simple sequence. I take input windows of length 3 and put them in a tf.data.Dataset. When I try to feed this to an LSTM layer, I get the following error:
ValueError: Input 0 of layer "lstm" is incompatible with the layer: expected ndim=3, found ndim=2. Full shape received: (None, 3)
After building the simplest tf.data.Dataset I can imagine, with just 4 samples, I try to feed it into an LSTM with 64 hidden units. Since each sequence is 3 steps long, I'm shaping the input as (2, 3, 1) (batch size = 2, steps = 3, features = 1). From the data I constructed, I expected every dataset element to have shapes ((3, 1), (1,)), so that after batching the first layer receives its (2, 3, 1), but that is not happening.
But I cannot see why this would happen for such a simple setup:
import tensorflow as tf
from keras.layers import LSTM, Dense

inputs = [[1, 2, 3], [2, 3, 4], [3, 4, 5], [4, 5, 6]]
outputs = [[4], [5], [6], [7]]
dataset = tf.data.Dataset.from_tensor_slices((inputs, outputs)).batch(2)

class Model(tf.keras.Model):
    def __init__(self, input_size, hidden_size, num_classes, steps, batch_size):
        super(Model, self).__init__()
        self.lstm = LSTM(hidden_size, input_shape=(batch_size, steps, input_size))
        self.fc = Dense(num_classes)

    def call(self, input, training=False):
        out = self.lstm(input)
        out = self.fc(out)
        return out

model = Model(input_size=1, hidden_size=64, num_classes=4, steps=3, batch_size=2)
model.build(input_shape=(2, 3, 1))
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics='accuracy')
model.summary()
model.fit(dataset, epochs=2)
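For what it's worth, printing the element spec of the batched dataset (a quick check, not part of my script above) shows only two dimensions per input element:

print(dataset.element_spec)
# (TensorSpec(shape=(None, 3), dtype=tf.int32, name=None),
#  TensorSpec(shape=(None, 1), dtype=tf.int32, name=None))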
tf.data.Dataset doesn't have any reshape function, so I cannot follow most of the other answers on SO. What is missing to fit the input to the LSTM?
CodePudding user response:
Try adding the additional dimension to your data, since an LSTM layer needs a per-sample input shape of (timesteps, features), without the batch size; your dataset currently yields samples of shape (3,), which is why the layer sees ndim=2 after batching. The layer also requires floating-point data rather than integers. Also, if you use categorical_crossentropy, your labels need to be one-hot encoded. Otherwise, use sparse_categorical_crossentropy and make sure your labels begin at 0 and not 4:
import tensorflow as tf
from keras.layers import LSTM, Dense

# Float inputs; indexing with [..., None] below adds a trailing feature
# dimension, turning each sample from shape (3,) into (3, 1).
inputs = tf.constant([[1, 2, 3], [2, 3, 4], [3, 4, 5], [4, 5, 6]], dtype=tf.float32)
# Labels shifted to start at 0, as sparse_categorical_crossentropy expects.
outputs = tf.constant([[0], [1], [2], [3]])
dataset = tf.data.Dataset.from_tensor_slices((inputs[..., None], outputs)).batch(2)

class Model(tf.keras.Model):
    def __init__(self, input_size, hidden_size, num_classes, steps):
        super(Model, self).__init__()
        # Per-sample input shape (timesteps, features) -- no batch dimension.
        self.lstm = LSTM(hidden_size, input_shape=(steps, input_size))
        self.fc = Dense(num_classes)

    def call(self, input, training=False):
        out = self.lstm(input)
        out = self.fc(out)
        return out

model = Model(input_size=1, hidden_size=64, num_classes=4, steps=3)
model.build(input_shape=(2, 3, 1))
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.summary()
model.fit(dataset, epochs=2)
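Once it trains, you can sanity-check the output shape with a prediction on a dummy batch (a usage sketch; the sequence values are arbitrary):

preds = model.predict(tf.constant([[[5.], [6.], [7.]]]))  # input shape (1, 3, 1)
print(preds.shape)  # (1, 4): one score per class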
If you want to use categorical_crossentropy as your loss function, try changing your dataset like this:
dataset = tf.data.Dataset.from_tensor_slices((inputs[..., None], tf.keras.utils.to_categorical(outputs, 4))).batch(2)
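And since tf.data.Dataset has no reshape method, you can also add the feature dimension inside the pipeline instead of indexing with [..., None]; a sketch of the same idea using map and tf.expand_dims:

dataset = tf.data.Dataset.from_tensor_slices((inputs, outputs))
dataset = dataset.map(lambda x, y: (tf.expand_dims(x, -1), y)).batch(2)  # (3,) -> (3, 1) per sample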