Suppose I have a layer output of shape (None, 128). How can I reshape it into (None, 200, 128), duplicating the 128 values 200 times? For context, I am trying to concatenate the output of a dense layer with shape (None, 128) with another layer with shape (None, 200, 1024) so that the result has shape (None, 200, 1152), which is why I want to reshape the first output.
And just as a sanity check, is concatenating these two layers even a good idea? I am still new to deep learning, so I am not sure whether I am approaching my problem correctly.
Here is the code that would produce the first output:
def build_tabular_data_network(image_metadata, train_bn=True):
    x = KL.Dense(1024, name="metadata_network_dense1")(image_metadata)
    x = BatchNorm(name="metadata_bn1")(x, training=train_bn)
    x = KL.Activation('relu')(x)
    x = KL.Dense(512, name="metadata_network_dense2")(x)
    x = BatchNorm(name="metadata_bn2")(x, training=train_bn)
    x = KL.Activation('relu')(x)
    x = KL.Dense(256, name="metadata_network_dense3")(x)
    x = BatchNorm(name="metadata_bn3")(x, training=train_bn)
    x = KL.Activation('relu')(x)
    x = KL.Dense(128, name="metadata_network_dense4")(x)
    x = BatchNorm(name="metadata_bn4")(x, training=train_bn)
    x = KL.Activation('relu')(x)
    x = tf.expand_dims(x, axis=1)
    x = tf.repeat(x, repeats=200, axis=1)
    return x
Here is the code that concatenates the previous output ('metadata_network') with a new output ('shared'):
def fpn_classifier_graph(rois, feature_maps, image_meta, metadata_network,
                         pool_size, num_classes, train_bn=True,
                         fc_layers_size=1024):
    """Builds the computation graph of the feature pyramid network classifier
    and regressor heads.
    rois: [batch, num_rois, (y1, x1, y2, x2)] Proposal boxes in normalized
          coordinates.
    feature_maps: List of feature maps from different layers of the pyramid,
                  [P2, P3, P4, P5]. Each has a different resolution.
    image_meta: [batch, (meta data)] Image details. See compose_image_meta()
    pool_size: The width of the square feature map generated from ROI Pooling.
    num_classes: number of classes, which determines the depth of the results
    train_bn: Boolean. Train or freeze Batch Norm layers
    fc_layers_size: Size of the 2 FC layers
    Returns:
        logits: [batch, num_rois, NUM_CLASSES] classifier logits (before softmax)
        probs: [batch, num_rois, NUM_CLASSES] classifier probabilities
        bbox_deltas: [batch, num_rois, NUM_CLASSES, (dy, dx, log(dh), log(dw))]
                     Deltas to apply to proposal boxes
    """
    # ROI Pooling
    # Shape: [batch, num_rois, POOL_SIZE, POOL_SIZE, channels]
    x = PyramidROIAlign([pool_size, pool_size],
                        name="roi_align_classifier")([rois, image_meta] + feature_maps)
    # Two 1024 FC layers (implemented with Conv2D for consistency)
    x = KL.TimeDistributed(KL.Conv2D(fc_layers_size, (pool_size, pool_size), padding="valid"),
                           name="mrcnn_class_conv1")(x)
    x = KL.TimeDistributed(BatchNorm(), name='mrcnn_class_bn1')(x, training=train_bn)
    x = KL.Activation('relu')(x)
    x = KL.TimeDistributed(KL.Conv2D(fc_layers_size, (1, 1)),
                           name="mrcnn_class_conv2")(x)
    x = KL.TimeDistributed(BatchNorm(), name='mrcnn_class_bn2')(x, training=train_bn)
    x = KL.Activation('relu')(x)
    shared = KL.Lambda(lambda x: K.squeeze(K.squeeze(x, 3), 2),
                       name="pool_squeeze")(x)
    shared = KL.Concatenate(axis=2, name="pool_squeezed2")([shared, metadata_network])
    # Classifier head
    mrcnn_class_logits = KL.TimeDistributed(KL.Dense(num_classes),
                                            name='mrcnn_class_logits')(shared)
    mrcnn_probs = KL.TimeDistributed(KL.Activation("softmax"),
                                     name="mrcnn_class")(mrcnn_class_logits)
    # BBox head
    # [batch, num_rois, NUM_CLASSES * (dy, dx, log(dh), log(dw))]
    x = KL.TimeDistributed(KL.Dense(num_classes * 4, activation='linear'),
                           name='mrcnn_bbox_fc')(shared)
    # Reshape to [batch, num_rois, NUM_CLASSES, (dy, dx, log(dh), log(dw))]
    s = K.int_shape(x)
    mrcnn_bbox = KL.Reshape((s[1], num_classes, 4), name="mrcnn_bbox")(x)
    return mrcnn_class_logits, mrcnn_probs, mrcnn_bbox
This code produces the following error:
ValueError: A `Concatenate` layer requires inputs with matching shapes except for the concat axis. Got inputs shapes: [(None, 200, 1024), (None, None, 128)]
CodePudding user response:
Concatenating two layers is not a "bad" idea, but it really depends on your use case. Regarding your first question, you can try something like this:
import tensorflow as tf
x = tf.random.normal((5, 128))
x = tf.expand_dims(x, axis=1)
x = tf.repeat(x, repeats=200, axis=1)
y = tf.random.normal((5, 200, 1024))
tf.print('X shape -->', x.shape)
tf.print('Y shape -->', y.shape)
tf.print('Concatenated -->', tf.concat([x, y], axis=-1).shape)
X shape --> TensorShape([5, 200, 128])
Y shape --> TensorShape([5, 200, 1024])
Concatenated --> TensorShape([5, 200, 1152])
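Note that this works as expected in eager mode. The (None, None, 128) in your error message suggests that when raw tf.expand_dims / tf.repeat ops are applied to a symbolic Keras tensor inside your model, the repeated dimension is not tracked statically at graph-construction time, which is likely why Concatenate complains about mismatched shapes. A Keras-native layer such as RepeatVector declares that dimension explicitly, so the version below avoids the problem.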
Or with Keras layers:
x = tf.random.normal((5, 128))
y = tf.random.normal((5, 200, 1024))
input1 = tf.keras.layers.Input((128,))
input2 = tf.keras.layers.Input((200, 1024))
repeated_input = tf.keras.layers.RepeatVector(n=200)(input1)
output = tf.keras.layers.Concatenate(axis=-1)([repeated_input, input2])
model = tf.keras.Model(inputs=[input1, input2], outputs=output)
tf.print(model([x, y]).shape)
TensorShape([5, 200, 1152])
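Applied to your build_tabular_data_network, a minimal change (a sketch, assuming KL is tf.keras.layers as in your snippet; the layer name "metadata_repeat" is just illustrative) would be to swap the tf.expand_dims/tf.repeat pair for a RepeatVector layer:
# instead of:
#     x = tf.expand_dims(x, axis=1)
#     x = tf.repeat(x, repeats=200, axis=1)
x = KL.RepeatVector(200, name="metadata_repeat")(x)  # (None, 128) -> (None, 200, 128)
The Concatenate in fpn_classifier_graph should then see a fully defined (None, 200, 128) tensor and match the (None, 200, 1024) branch along the last axis.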