I'm having a problem merging two deep learning models. I'm trying to combine two models into one for a multi-class classification problem, but there is a problem with the output layer.
Code:
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.layers import Input, Dense, concatenate
from tensorflow.keras.models import Model
from tensorflow.keras.utils import plot_model
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import train_test_split
# Model A
a_ip_img = Input(shape=(5238, 1, 1500), name="Input_a")
al_1 = Dense(64, activation="relu", name="a_layer_1")(a_ip_img)
al_2 = Dense(128, activation="relu", name="a_layer_2")(al_1)
al_3 = Dense(64, activation="relu", name="a_layer_3")(al_2)
al_4 = Dense(32, activation="softmax", name="a_output_layer")(al_3)
# Model B
b_ip_img = Input(shape=(5238, 1, 1500), name="Input_b")
bl_1 = Dense(64, activation="relu", name="b_layer_1")(b_ip_img)
bl_2 = Dense(32, activation="softmax", name="b_output_layer")(bl_1)
# Merging model A and B
a_b = concatenate([al_4, bl_2], name="concatenated_layer")
# Final layer
output_layer = Dense(7, activation="softmax", name="output_layer")(a_b)
# Model definition
merged = Model(inputs=[(a_ip_img, b_ip_img)], outputs=[output_layer], name="merged_model")
# Model details
merged.summary()
keras.utils.plot_model(merged, "output/architecture.png", show_shapes=True)
opt1 = keras.optimizers.Adam(learning_rate=0.001)
merged.compile(loss="categorical_crossentropy", optimizer=opt1, metrics=["accuracy"])
wordsfreq = TfidfVectorizer(max_features=1500)
X_train, X_test, y_train, y_test = train_test_split(TF_IDF_Words1, Y, test_size=0.33, random_state=42)
history = merged.fit([X_train, X_train], y_train, validation_data=(X_test, y_test), epochs=20, batch_size=128)
Error:
AssertionError: Could not compute output KerasTensor(type_spec=TensorSpec(shape=(None, 5238, 1, 7), dtype=tf.float32, name=None), name='output_layer/Softmax:0', description="created by layer 'output_layer'")
Answer:
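The model has two inputs, but it is probably being given only one somewhere, so Keras cannot compute the output of the final softmax layer. For instance, validation_data=(X_test,y_test) in the fit call supplies a single array to a two-input model. Otherwise the model is fine: called directly with two input tensors, it produces an output.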
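As a rough illustration, the sketch below shows one way the training call might look when both inputs are supplied for training and for validation. It is not the asker's exact setup, and several things in it are assumptions: the helper name build_merged_model is made up for this example, the random arrays are placeholders for the TF-IDF matrix and one-hot labels, and the input shape (1500,) assumes that each sample is a single 1500-dimensional TF-IDF vector (so that 5238 would be the number of samples rather than part of the input shape).

```python
import numpy as np
from tensorflow.keras.layers import Input, Dense, concatenate
from tensorflow.keras.models import Model


def build_merged_model(n_features=1500, n_classes=7):
    """Hypothetical helper: two Dense branches merged into one classifier."""
    # Branch A
    a_in = Input(shape=(n_features,), name="Input_a")
    a = Dense(64, activation="relu", name="a_layer_1")(a_in)
    a = Dense(128, activation="relu", name="a_layer_2")(a)
    a = Dense(64, activation="relu", name="a_layer_3")(a)
    a_out = Dense(32, activation="softmax", name="a_output_layer")(a)

    # Branch B
    b_in = Input(shape=(n_features,), name="Input_b")
    b = Dense(64, activation="relu", name="b_layer_1")(b_in)
    b_out = Dense(32, activation="softmax", name="b_output_layer")(b)

    # Merge both branches and classify
    merged = concatenate([a_out, b_out], name="concatenated_layer")
    out = Dense(n_classes, activation="softmax", name="output_layer")(merged)

    # Flat list of inputs: one entry per Input layer
    return Model(inputs=[a_in, b_in], outputs=out, name="merged_model")


# Placeholder data (assumption): random features and random one-hot labels
rng = np.random.default_rng(0)
X_train = rng.random((64, 1500)).astype("float32")
X_test = rng.random((32, 1500)).astype("float32")
y_train = np.eye(7, dtype="float32")[rng.integers(0, 7, size=64)]
y_test = np.eye(7, dtype="float32")[rng.integers(0, 7, size=32)]

model = build_merged_model()
model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=["accuracy"])

history = model.fit(
    [X_train, X_train], y_train,
    # The validation data needs two inputs as well, in the same order
    validation_data=([X_test, X_test], y_test),
    epochs=1, batch_size=16, verbose=0,
)
```

If the real feature matrix comes straight from TfidfVectorizer.fit_transform, it is a SciPy sparse matrix and may need converting (for example with .toarray()) before being passed to fit.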
# Quick check: build two random input tensors
_input1 = tf.random.normal((1,51238,1,1500))
_input2 = tf.random.normal((1,51238,1,1500))
# Now pass both inputs to the model
merged([_input1, _input2])
Output:
<tf.Tensor: shape=(1, 51238, 1, 7), dtype=float32, numpy=
array([[[[0.14262925, 0.13413729, 0.14718482, ..., 0.1514142 ,
0.13827263, 0.13065808]],
[[0.13692997, 0.13697329, 0.14530036, ..., 0.15573122,
0.13480991, 0.1428994 ]],
[[0.1391519 , 0.139768 , 0.14587076, ..., 0.14994502,
0.13614304, 0.14207283]],
...,