Apologies for this newbie question. I'm trying to train a regression model with Keras, but I get an error in model.fit().
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
import numpy as np
inputs = keras.Input(shape=(6,5), name="digits")
x = layers.Dense(64, activation="relu", name="dense_1")(inputs)
x = layers.Dense(64, activation="relu", name="dense_2")(x)
outputs = layers.Dense(1, activation="softmax", name="predictions")(x)
model = keras.Model(inputs=inputs, outputs=outputs)
x_train = np.array([[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14],
[ 0, 1, 2, 3, 4],
[ 5, 6, 7, 8, 9],
[10, 11, 12, 13, 14]])
y_train = np.array([1, 2, 3, 1, 2, 3])
model.compile(loss=keras.losses.SparseCategoricalCrossentropy())
history = model.fit(x_train,y_train)
This is the error. What does it mean, and how can I fix it? I'm using TensorFlow 2.7.0.
Input 0 of layer "model" is incompatible with the layer: expected shape=(None, 6, 5), found shape=(None, 5)
Answer:
To fix the error, you need to be clear about the input and output shapes of your data. Judging from your code, there are 6 training samples (3 distinct ones, each appearing twice), and you want to map [0,1,2,3,4] to 1, [5,6,7,8,9] to 2, and [10,11,12,13,14] to 3.
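Therefore, the input shape of a single sample is (5,) and the output shape is (1,). The shape argument of tf.keras.Input describes one sample without the batch dimension, which is why shape=(6,5) made the model expect (None, 6, 5) while the data has shape (None, 5). So (5,) should be used in tf.keras.Input, and y_train needs to be reshaped into (6,1).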
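The shapes discussed above can be checked directly with NumPy before building the model. This is a small sketch that reuses the arrays from the question; the names per_sample_shape and y_column are only illustrative.

```python
import numpy as np

x_train = np.array([[ 0,  1,  2,  3,  4],
                    [ 5,  6,  7,  8,  9],
                    [10, 11, 12, 13, 14],
                    [ 0,  1,  2,  3,  4],
                    [ 5,  6,  7,  8,  9],
                    [10, 11, 12, 13, 14]])
y_train = np.array([1, 2, 3, 1, 2, 3])

# The first axis is the batch (number of samples); the rest is one sample.
print(x_train.shape)                  # (6, 5)
per_sample_shape = x_train.shape[1:]  # illustrative name
print(per_sample_shape)               # (5,)

# Indexing with None adds a trailing axis: (6,) becomes (6, 1).
y_column = y_train[:, None]           # illustrative name
print(y_column.shape)                 # (6, 1)
```

per_sample_shape is the value that would go into the shape argument of tf.keras.Input.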
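Moreover, since you want to do regression, the output layer's activation and the loss function should suit regression. Softmax over a single unit always outputs 1, and SparseCategoricalCrossentropy is a classification loss, so use a linear output and a loss such as mean squared error instead. (See the example below.)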
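To see why softmax does not work on a one-unit output layer, here is a plain-Python sketch of softmax. The softmax function below is a hand-written helper for illustration, not the Keras implementation.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats (illustrative helper)."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

# With a single output unit, softmax normalizes over one value,
# so the result is 1.0 no matter what the layer computes.
for logit in (-3.0, 0.0, 42.0):
    print(logit, softmax([logit]))
```

Since the output would be stuck at 1, the network could never predict 2 or 3, which is why a linear output is the usual choice for regression.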
Finally, adjust the optimizer type, learning rate and other hyperparameters for better performance.
Demonstration:
import numpy as np
import tensorflow as tf

inputs = tf.keras.Input(shape=(5,), name="digits")  # each sample has 5 features
x = tf.keras.layers.Dense(64, activation="relu", name="dense_1")(inputs)
x = tf.keras.layers.Dense(64, activation="relu", name="dense_2")(x)
outputs = tf.keras.layers.Dense(1, name="predictions")(x)  # no activation, i.e. linear output
model = tf.keras.Model(inputs, outputs)
x_train = np.array([[ 0,  1,  2,  3,  4],
                    [ 5,  6,  7,  8,  9],
                    [10, 11, 12, 13, 14],
                    [ 0,  1,  2,  3,  4],
                    [ 5,  6,  7,  8,  9],
                    [10, 11, 12, 13, 14]])
y_train = np.array([1, 2, 3, 1, 2, 3])[:, None]  # reshape from (6,) to (6, 1)
model.compile(
    optimizer=tf.keras.optimizers.SGD(learning_rate=0.001, momentum=0.99),
    loss=tf.keras.losses.MeanSquaredError(),  # MSE is a regression loss
)
model.fit(x_train, y_train, epochs=500, verbose=0)
print(model.predict(x_train))
'''
Example output (exact values vary from run to run):
[[1.0019126]
[2.010047 ]
[3.0027502]
[1.0019126]
[2.010047 ]
[3.0027502]]
'''
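As a quick sanity check, the mean squared error can be recomputed by hand from the predictions printed above. This is a plain-Python sketch; mean_squared_error is a hypothetical helper written here for illustration, not the Keras loss object.

```python
# Predictions copied from the output above, targets from y_train.
predictions = [1.0019126, 2.010047, 3.0027502, 1.0019126, 2.010047, 3.0027502]
targets = [1, 2, 3, 1, 2, 3]

def mean_squared_error(y_true, y_pred):
    """Average of the squared differences (illustrative helper)."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

mse = mean_squared_error(targets, predictions)
print(f"MSE on the training set: {mse:.2e}")
```

A value close to zero confirms that the model fits the six training samples, although it says nothing about how well it generalizes to unseen inputs.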