Suppose I have a layer output of shape (None, 128). How can I reshape it into (None, 200, 128), duplicating the 128 values 200 times? For context, I am trying to concatenate the output of a dense layer with shape (None, 128) with another layer with shape (None, 200, 1024) so that the result has shape (None, 200, 1152), which is why I want to reshape the first output.
And just as a sanity check, is concatenating these two layers even a good idea? I am still new to deep learning, so I am not sure whether I am approaching my problem correctly.
Here is the code that would produce the first output:
def build_tabular_data_network(image_metadata, train_bn=True):
    x = KL.Dense(1024, name="metadata_network_dense1")(image_metadata)
    x = BatchNorm(name="metadata_bn1")(x, training=train_bn)
    x = KL.Activation('relu')(x)
    x = KL.Dense(512, name="metadata_network_dense2")(x)
    x = BatchNorm(name="metadata_bn2")(x, training=train_bn)
    x = KL.Activation('relu')(x)
    x = KL.Dense(256, name="metadata_network_dense3")(x)
    x = BatchNorm(name="metadata_bn3")(x, training=train_bn)
    x = KL.Activation('relu')(x)
    x = KL.Dense(128, name="metadata_network_dense4")(x)
    x = BatchNorm(name="metadata_bn4")(x, training=train_bn)
    x = KL.Activation('relu')(x)
    x = tf.expand_dims(x, axis=1)
    x = tf.repeat(x, repeats=200, axis=1)
    return x
Here is the code that concatenates the previous output ('metadata_network') with a new output ('shared'):
def fpn_classifier_graph(rois, feature_maps, image_meta, metadata_network,
                         pool_size, num_classes, train_bn=True,
                         fc_layers_size=1024):
    """Builds the computation graph of the feature pyramid network classifier
    and regressor heads.
    rois: [batch, num_rois, (y1, x1, y2, x2)] Proposal boxes in normalized
          coordinates.
    feature_maps: List of feature maps from different layers of the pyramid,
                  [P2, P3, P4, P5]. Each has a different resolution.
    image_meta: [batch, (meta data)] Image details. See compose_image_meta()
    pool_size: The width of the square feature map generated from ROI Pooling.
    num_classes: number of classes, which determines the depth of the results
    train_bn: Boolean. Train or freeze Batch Norm layers
    fc_layers_size: Size of the 2 FC layers
    Returns:
        logits: [batch, num_rois, NUM_CLASSES] classifier logits (before softmax)
        probs: [batch, num_rois, NUM_CLASSES] classifier probabilities
        bbox_deltas: [batch, num_rois, NUM_CLASSES, (dy, dx, log(dh), log(dw))]
                     Deltas to apply to proposal boxes
    """
    # ROI Pooling
    # Shape: [batch, num_rois, POOL_SIZE, POOL_SIZE, channels]
    x = PyramidROIAlign([pool_size, pool_size],
                        name="roi_align_classifier")([rois, image_meta] + feature_maps)
    # Two 1024 FC layers (implemented with Conv2D for consistency)
    x = KL.TimeDistributed(KL.Conv2D(fc_layers_size, (pool_size, pool_size), padding="valid"),
                           name="mrcnn_class_conv1")(x)
    x = KL.TimeDistributed(BatchNorm(), name='mrcnn_class_bn1')(x, training=train_bn)
    x = KL.Activation('relu')(x)
    x = KL.TimeDistributed(KL.Conv2D(fc_layers_size, (1, 1)),
                           name="mrcnn_class_conv2")(x)
    x = KL.TimeDistributed(BatchNorm(), name='mrcnn_class_bn2')(x, training=train_bn)
    x = KL.Activation('relu')(x)
    shared = KL.Lambda(lambda x: K.squeeze(K.squeeze(x, 3), 2),
                       name="pool_squeeze")(x)
    shared = KL.Concatenate(axis=2, name="pool_squeezed2")([shared, metadata_network])
    # Classifier head
    mrcnn_class_logits = KL.TimeDistributed(KL.Dense(num_classes),
                                            name='mrcnn_class_logits')(shared)
    mrcnn_probs = KL.TimeDistributed(KL.Activation("softmax"),
                                     name="mrcnn_class")(mrcnn_class_logits)
    # BBox head
    # [batch, num_rois, NUM_CLASSES * (dy, dx, log(dh), log(dw))]
    x = KL.TimeDistributed(KL.Dense(num_classes * 4, activation='linear'),
                           name='mrcnn_bbox_fc')(shared)
    # Reshape to [batch, num_rois, NUM_CLASSES, (dy, dx, log(dh), log(dw))]
    s = K.int_shape(x)
    mrcnn_bbox = KL.Reshape((s[1], num_classes, 4), name="mrcnn_bbox")(x)
    return mrcnn_class_logits, mrcnn_probs, mrcnn_bbox
This code produces the following error:
ValueError: A `Concatenate` layer requires inputs with matching shapes except for the concat axis. Got inputs shapes: [(None, 200, 1024), (None, None, 128)]
CodePudding user response:
Concatenating two layers is not a "bad" idea, but it really depends on your use case. Regarding your first question, you can try something like this:
import tensorflow as tf
x = tf.random.normal((5, 128))
x = tf.expand_dims(x, axis=1)
x = tf.repeat(x, repeats=200, axis=1)
y = tf.random.normal((5, 200, 1024))
tf.print('X shape -->', x.shape)
tf.print('Y shape -->', y.shape)
tf.print('Concatenated -->', tf.concat([x, y], axis=-1).shape)
X shape --> TensorShape([5, 200, 128])
Y shape --> TensorShape([5, 200, 1024])
Concatenated --> TensorShape([5, 200, 1152])
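Note that this works as expected in eager mode. The (None, None, 128) in your error message suggests that when raw tf.expand_dims / tf.repeat ops are applied to a symbolic Keras tensor inside your model, the repeated dimension is not tracked statically at graph-construction time, which is likely why Concatenate complains about mismatched shapes. A Keras-native layer such as RepeatVector declares that dimension explicitly, so the version below avoids the problem.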
Or with Keras layers:
x = tf.random.normal((5, 128))
y = tf.random.normal((5, 200, 1024))
input1 = tf.keras.layers.Input((128,))
input2 = tf.keras.layers.Input((200, 1024))
repeated_input = tf.keras.layers.RepeatVector(n=200)(input1)
output = tf.keras.layers.Concatenate(axis=-1)([repeated_input, input2])
model = tf.keras.Model(inputs=[input1, input2], outputs=output)
tf.print(model([x, y]).shape)
TensorShape([5, 200, 1152])
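Applied to your build_tabular_data_network, a minimal change (a sketch, assuming KL is tf.keras.layers as in your snippet; the layer name "metadata_repeat" is just illustrative) would be to swap the tf.expand_dims/tf.repeat pair for a RepeatVector layer:
# instead of:
#     x = tf.expand_dims(x, axis=1)
#     x = tf.repeat(x, repeats=200, axis=1)
x = KL.RepeatVector(200, name="metadata_repeat")(x)  # (None, 128) -> (None, 200, 128)
The Concatenate in fpn_classifier_graph should then see a fully defined (None, 200, 128) tensor and match the (None, 200, 1024) branch along the last axis.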