movie similarity using Word2Vec and deep Convolutional Autoencoders-CodePudding

i am new to python and i am trying to create a model that can measure how similar movies are based on the movies description,the steps i followed so far are:

1.turn each movie description into a vector of 100*(maximum number of words possible for a movie description) values using Word2Vec, this results in a 21300-values vector for each movie description. 2.create a deep convolutional autoencoder that tries to compress each vector(and hopefully extract meaning from it).

while the first step was successful and i am still struggling with the autoencoder, here is my code so far:

encoder_input = keras.Input(shape=(21300,), name='sum')
encoded= tf.keras.layers.Reshape((150,142,1),input_shape=(21300,))(encoder_input)
x = tf.keras.layers.Conv2D(128, (3, 3), activation="relu", padding="same",input_shape=(1,128,150,142))(encoded)
x = tf.keras.layers.MaxPooling2D((2, 2), padding="same")(x)
x = tf.keras.layers.Conv2D(64, (3, 3), activation="relu", padding="same")(x)
x = tf.keras.layers.MaxPooling2D((2, 2), padding="same")(x)#49*25*64
x = tf.keras.layers.Conv2D(32, (3, 3), activation="relu", padding="same")(x)
x = tf.keras.layers.MaxPooling2D((2, 2), padding="same")(x)#25*13*32
x = tf.keras.layers.Conv2D(16, (3, 3), activation="relu", padding="same")(x)
x = tf.keras.layers.MaxPooling2D((2, 2), padding="same")(x)
x = tf.keras.layers.Conv2D(8, (3, 3), activation="relu", padding="same")(x)
x = tf.keras.layers.MaxPooling2D((2, 2), padding="same")(x)
x=tf.keras.layers.Flatten()(x)
encoder_output=keras.layers.Dense(units=90, activation='relu',name='encoder')(x)
x= tf.keras.layers.Reshape((10,9,1),input_shape=(28,))(encoder_output)

# Decoder

decoder_input=tf.keras.layers.Conv2D(8, (3, 3), activation='relu', padding='same')(x)
x = tf.keras.layers.UpSampling2D((2, 2))(decoder_input)
x = tf.keras.layers.Conv2D(16, (3, 3), activation='relu')(x)
x = tf.keras.layers.UpSampling2D((2, 2))(x)
x = tf.keras.layers.Conv2D(32, (3, 3), activation='relu')(x)
x = tf.keras.layers.UpSampling2D((2, 2))(x)
x = tf.keras.layers.Conv2D(64, (3, 3), activation='relu')(x)
x = tf.keras.layers.UpSampling2D((2, 2))(x)
x = tf.keras.layers.Conv2D(128, (3, 3), activation='relu')(x)
x = tf.keras.layers.UpSampling2D((2, 2))(x)
decoder_output = keras.layers.Conv2D(1, (3, 3), activation='relu', padding='same')(x)

autoencoder = keras.Model(encoder_input, decoder_output)
opt = tf.keras.optimizers.Adam(learning_rate=0.001, decay=1e-6)

autoencoder = keras.Model(encoder_input, decoder_output, name='autoencoder')

autoencoder.compile(opt, loss='mse')
print("STARTING FITTING")


history = autoencoder.fit(
movies_vector,
movies_vector,
epochs=25,

        )


print("ENCODER READY")
#USING THE MIDDLE LAYER 
encoder = keras.Model(inputs=autoencoder.input,
                    outputs=autoencoder.get_layer('encoder').output)

running this code gives me the following error:

required broadcastable shapes [[node mean_squared_error/SquaredDifference (defined at tmp/ipykernel_52/3425712667.py:119) ]] [Op:__inference_train_function_1568]

i have two questions:

1.how can i fix this error?

2.how can i improve my autoencoder so that i can use the compressed vectors to test for movie similarity?

CodePudding user response：

The output of your model is (batch_size, 260, 228, 1), while your targets appear to be (batch_size, 21300). You can solve that problem by either adding a tf.keras.layers.Flatten() layer to the end of your model, or by not flattening your input.
You probably should not be using 2D convolutions, as there is no spatial or temporal correlation between adjacent feature channels in most text embedding. You should be able to safely reshape to (150,142) rather than (150, 142, 1) and use 1D convolution, pooling, and upsampling layers.