I am trying to use TensorFlow and Keras to build a prediction model.
I first read in my dataset that has shape (7709, 58), then normalize it:
normalizer = tf.keras.layers.Normalization(axis=-1)
normalizer.adapt(np.array(dataset))
Then I split the data into training and testing data:
train_dataset = dataset[:5000]
test_dataset = dataset[5000:]
I prepare those datasets:
train_dataset.describe().transpose()
test_dataset.describe().transpose()
train_features = train_dataset.copy()
test_features = test_dataset.copy()
train_labels = train_features.pop('outcome')
test_labels = test_features.pop('outcome')
Then I build the model:
def build_and_compile_model(norm):
    model = keras.Sequential([
        norm,
        layers.Dense(64, activation='relu'),
        layers.Dense(64, activation='relu'),
        layers.Dense(1)
    ])
    model.compile(loss='mean_squared_error', metrics=['mean_squared_error'],
                  optimizer=tf.keras.optimizers.Adam(0.001))
    return model
dnn_model = build_and_compile_model(normalizer)
When I then try to fit the model, it fails:
history = dnn_model.fit(
    test_features,
    test_labels,
    validation_split=0.2, epochs=50)
This gives the following error:
ValueError: in user code:
    File "/usr/local/lib/python3.7/dist-packages/keras/engine/training.py", line 1021, in train_function *
        return step_function(self, iterator)
    File "/usr/local/lib/python3.7/dist-packages/keras/engine/training.py", line 1010, in step_function **
        outputs = model.distribute_strategy.run(run_step, args=(data,))
    File "/usr/local/lib/python3.7/dist-packages/keras/engine/training.py", line 1000, in run_step **
        outputs = model.train_step(data)
    File "/usr/local/lib/python3.7/dist-packages/keras/engine/training.py", line 859, in train_step
        y_pred = self(x, training=True)
    File "/usr/local/lib/python3.7/dist-packages/keras/utils/traceback_utils.py", line 67, in error_handler
        raise e.with_traceback(filtered_tb) from None
ValueError: Exception encountered when calling layer "normalization_7" (type Normalization).
Dimensions must be equal, but are 57 and 58 for '{{node sequential_7/normalization_7/sub}} = Sub[T=DT_FLOAT](sequential_7/Cast, sequential_7/normalization_7/sub/y)' with input shapes: [?,57], [1,58].
Does anyone know what the issue is and how I can address it? Thanks!
CodePudding user response:
Wild guess but maybe replace:
train_features = train_dataset.copy()
With:
from copy import deepcopy
train_features = deepcopy(train_dataset)
CodePudding user response:
You lost the outcome column in the dataframe because of pop. Try extracting that column using:
train_labels = train_features['outcome']
test_labels = test_features['outcome']
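
Another possible reading, offered as a sketch rather than a confirmed diagnosis: the error compares 57 and 58 columns, which is consistent with the normalizer having been adapted on the full 58-column dataset while the model receives only 57 feature columns after pop. The NumPy/pandas snippet below reproduces that shape mismatch on made-up data and shows that computing the statistics after the pop makes the shapes agree. The random frame, the column names f0..f56, and the names full_mean, feat_mean, and feat_std are illustrative assumptions, not part of the question.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the real data: 20 rows, 58 columns, one named 'outcome'.
rng = np.random.default_rng(0)
columns = [f"f{i}" for i in range(57)] + ["outcome"]
dataset = pd.DataFrame(rng.normal(size=(20, 58)), columns=columns)

# Statistics computed on the full frame have 58 entries,
# like a normalizer adapted before the label column is removed.
full_mean = dataset.to_numpy().mean(axis=0, keepdims=True)  # shape (1, 58)

train_features = dataset[:15].copy()
train_labels = train_features.pop("outcome")  # features now have 57 columns

# Subtracting 58 means from 57 feature columns cannot broadcast.
try:
    train_features.to_numpy() - full_mean  # (15, 57) - (1, 58)
    mismatch = False
except ValueError:
    mismatch = True

# Computing the statistics after the pop keeps both sides at 57 columns.
feature_values = train_features.to_numpy()
feat_mean = feature_values.mean(axis=0, keepdims=True)  # shape (1, 57)
feat_std = feature_values.std(axis=0, keepdims=True)    # shape (1, 57)
normalized = (feature_values - feat_mean) / feat_std    # shape (15, 57)

print(mismatch, full_mean.shape, feat_mean.shape, normalized.shape)
```

In the Keras code from the question, the analogous change would presumably be to call normalizer.adapt(np.array(train_features)) after the pop instead of adapting on the whole dataset. Indexing train_features['outcome'] without popping also makes the shapes match, but it leaves the label among the model inputs.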